Refactor to decouple availability from table schemas #165

michael-mclawhorn · 2017-02-08T20:09:24Z

No description provided.

archolewa · 2017-02-08T20:27:22Z

😱

QubitPi · 2017-02-08T21:53:16Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/PreResponseDeserializer.java

+
+        columns.addAll(
+                Streams.stream(metricEntries)
+                        .map(entry-> new MetricColumnWithValueType(entry.getKey(), entry.getValue().asText()))


Not sure if this matters but there is a space missing between entry and ->

QubitPi · 2017-02-08T22:08:18Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/Column.java

+     * @return name
+     */
+    public String getName() {
+        return this.name;


The "this." qualifier might be unnecessary here.

QubitPi · 2017-02-08T22:20:23Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/LogicalTableSchema.java

+                tableGroup.getApiMetricNames().stream()
+                        .filter(apiMetricName ->  apiMetricName.isValidFor(granularity))
+                        .map(ApiMetricName::getApiName)
+                .map(name -> new LogicalMetricColumn(name, metricDictionary.get(name)))


Might be easier to read if this is indented by 2 tabs more

QubitPi · 2017-02-08T22:38:42Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/PhysicalTableSchema.java

+            @NotNull Map<String, String> logicalToPhysicalColumnNames
+    ) {
+        super(columns);
+        this.granularity = timeGrain;


The "this." qualifier might be unnecessary here.

QubitPi · 2017-02-08T22:39:12Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/PhysicalTableSchema.java

+        this.granularity = timeGrain;
+
+        this.logicalToPhysicalColumnNames = Collections.unmodifiableMap(logicalToPhysicalColumnNames);
+        this.physicalToLogicalColumnNames = Collections.unmodifiableMap(


The "this." qualifier might be unnecessary here.

QubitPi · 2017-02-08T22:40:05Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/PhysicalTableSchema.java

+     * @return true if this table supports this column explicitly
+     */
+    public boolean containsLogicalName(String
+            logicalName) {


String logicalName can be put on the same line

QubitPi · 2017-02-08T22:42:17Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/availability/Availability.java

+        if (result != null) {
+            return result;
+        }
+        return Collections.emptyList();


The two returns can be combined into a single statement: return result == null ? Collections.emptyList() : result
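
For illustration, here is a minimal sketch of the combined form. The class and method names (IntervalLookup, getOrEmpty) are hypothetical and not taken from the PR; only the ternary itself mirrors the suggestion.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Fili: shows the single-return form of a null-to-empty lookup.
final class IntervalLookup {

    private IntervalLookup() { }

    // Returns the list mapped to the key, or an immutable empty list when there is no mapping.
    static <K, V> List<V> getOrEmpty(Map<K, List<V>> source, K key) {
        List<V> result = source.get(key);
        return result == null ? Collections.emptyList() : result;
    }
}
```

On Java 8 and later, Map.getOrDefault(key, Collections.emptyList()) is a near equivalent, with the difference that a key explicitly mapped to null still returns null.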

QubitPi · 2017-02-08T22:47:07Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/util/ImmutableWrapperMap.java

+ */
+public class ImmutableWrapperMap<K, V> implements Map<K, V> {
+
+    Map<K, V> map;


Just curious, if we implement Map, does it make sense to have this? Looks like HashMap doesn't do it this way http://grepcode.com/file_/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashMap.java/?v=source

michael-mclawhorn · 2017-02-09T01:49:48Z

@archolewa I'll restructure the commits tomorrow.

michael-mclawhorn · 2017-02-09T23:51:44Z

There was so much interdependency I couldn't even figure out how to simplify via several commits.

garyluoex

Mostly just style change and questions. Thanks!

garyluoex · 2017-02-10T18:04:06Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/PreResponseDeserializer.java

@@ -219,8 +219,7 @@ private ResultSet getResultSet(JsonNode serializedResultSet) {
     *
     * @return ZonedSchema object generated from the JsonNode


Need to fix comment here from ZonedSchema to ResultSetSchema

garyluoex · 2017-02-10T18:04:51Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/PreResponseDeserializer.java

@@ -229,21 +228,25 @@ private ZonedSchema getZonedSchema(JsonNode schemaNode) {
        );

        //Recreate ZonedSchema from granularity and timezone values


Need to fix comment here

garyluoex · 2017-02-10T18:36:41Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/ResultSetSchema.java

+/**
+ * The schema for a result set.
+ */
+public class ResultSetSchema extends BaseSchema implements Schema {


Might not need to implement Schema since BaseSchema implements Schema already?

garyluoex · 2017-02-10T19:46:00Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/ResultSetSerializationProxy.java

        Map<String, Object> schemaComponents = new HashMap<>();

-        String timeId = (schema instanceof ZonedSchema) ?
-                ((ZonedSchema) schema).getDateTimeZone().getID() :
-                SYSTEM_CONFIG.getStringProperty(SYSTEM_CONFIG.getPackageVariableName("timezone"), "UTC");


So we are completely discarding the timezone configuration property?

Time zone is now riding along via the granularity rather than in the schema top level.

garyluoex · 2017-02-10T20:12:59Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/config/table/BaseTableLoader.java

-                definition.getGrain(),
-                definition.getLogicalToPhysicalNames()
-        );
+         new LinkedHashSet<>();


Remove extra code.

garyluoex · 2017-02-13T16:47:24Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/web/RequestUtils.groovy

+                dataSourceName,
+                DAY.buildZonedTimeGrain(DateTimeZone.UTC),
+                [] as Set
+                ,


Can we move the comma back?

garyluoex · 2017-02-13T16:47:34Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/web/RequestUtils.groovy

+                dataSourceName,
+                DAY.buildZonedTimeGrain(DateTimeZone.UTC),
+                [] as Set
+                ,


Can we move the comma back?

garyluoex · 2017-02-13T16:47:58Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/web/RequestUtils.groovy

+                dataSourceName,
+                DAY.buildZonedTimeGrain(DateTimeZone.UTC),
+                [] as Set
+                ,


Can we move the comma back?

garyluoex · 2017-02-13T16:48:17Z

...re/src/test/groovy/com/yahoo/bard/webservice/web/SketchIntersectionReportingResources.groovy

+                "NETWORK",
+                DAY.buildZonedTimeGrain(UTC),
+                columns
+                ,


Can we move the comma back?

garyluoex · 2017-02-13T16:53:32Z

fili/rfc/rfc-composite-tables/base_notes.md

@@ -0,0 +1,88 @@
+Support Union Data Source Well


Are we going to merge these rfc notes?

cdeszaq

Part 1

And this will definitely need a CHANGELOG entry (or possibly many)

cdeszaq · 2017-02-10T21:30:45Z

fili-core/src/main/java/com/yahoo/bard/webservice/application/AbstractBinderFactory.java

@@ -167,6 +167,7 @@
            SYSTEM_CONFIG.getPackageVariableName("loader_scheduler_thread_pool_size"),
            LOADER_SCHEDULER_THREAD_POOL_SIZE_DEFAULT
    );
+    public static final String SYSTEM_CONFIG_TIMEZONE = "timezone";


In places where the constant is the property key, we've generally taken the approach of including _KEY in the constant name, so that there's no confusion and expectation that this constant holds the actual timezone, rather than the key.

cdeszaq · 2017-02-10T21:35:07Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/DruidQueryBuilder.java

        // The data source is the table directly, since there is no nested query below us
-        DataSource dataSource = new TableDataSource(table);
+        DataSource dataSource = new TableDataSource((ConcretePhysicalTable) table);


How long do we expect this code to live like this? I'm leery of the possibly unsafe cast w/o a check / log nearby

It looks like we are casting to a ConcretePhysicalTable in each of these building methods. Should we just update the signatures of these methods to take a ConcretePhysicalTable instead, and then do the cast before invoking these methods?

Also, what's the planned alternative to this? It looks like we're implicitly assuming that when we go to actually build a druid query, we are working with actual druid tables. Which makes sense until you factor in union datasources, unless union datasource are also considered ConcretePhysicalTables (haven't gotten that far).

Alright, I'm putting in a slightly more resilient variation. However, we should still tweak this in some way.

There's a little tech debt here. Existing union data sources will construct with either non-Concrete physical tables or sets of tables. Table data source will only be usable by concrete physical tables. This polymorphism will push up the stack, but I'd like to take it as tech debt right now.

Sounds reasonable.
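
As a sketch of what a checked cast could look like (the "slightly more resilient variation" itself is not shown in this thread, so this is only an assumption about its general shape): Table and ConcreteTable below are simplified stand-ins, not the real Fili PhysicalTable and ConcretePhysicalTable, and requireConcrete is a hypothetical helper.

```java
// Simplified stand-in for a physical table interface (hypothetical, not the Fili type).
interface Table {
    String getName();
}

// Simplified stand-in for a table backed by exactly one fact table.
final class ConcreteTable implements Table {
    private final String name;
    private final String factTableName;

    ConcreteTable(String name, String factTableName) {
        this.name = name;
        this.factTableName = factTableName;
    }

    @Override
    public String getName() {
        return name;
    }

    String getFactTableName() {
        return factTableName;
    }
}

final class TableCasts {

    private TableCasts() { }

    // Checks the runtime type first, so a wrong table type fails with a descriptive message
    // instead of a bare ClassCastException deep inside query building.
    static ConcreteTable requireConcrete(Table table) {
        if (table instanceof ConcreteTable) {
            return (ConcreteTable) table;
        }
        throw new IllegalStateException("Expected a concrete table but got: " + table.getName());
    }
}
```

Doing the check once at the call site would also let the building methods take the concrete type in their signatures, as suggested above.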

cdeszaq · 2017-02-10T21:43:11Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/ResultSetSchema.java

+/**
+ * The schema for a result set.
+ */
+public class ResultSetSchema extends BaseSchema implements Schema {


Doesn't need to declare that it implements Schema, since extends BaseSchema does that.

cdeszaq · 2017-02-10T21:44:50Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/ResultSetSchema.java

+import javax.validation.constraints.NotNull;
+
+/**
+ * The schema for a result set.


It would be good if we could expand a bit on what it means to be the schema of a result set. If we can't, that's fine, but I'd love to see this class comment be more useful / descriptive than just repeating the class name.

Things like expected use, meaning, and what it adds beyond (or how it differs from) a Schema or BaseSchema would all be good candidates for info to add here.

Let me know if the next commit is better.

cdeszaq · 2017-02-10T21:45:03Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/ResultSetSchema.java

+    /**
+     * The granularity of the ResultSet.
+     */
+    private Granularity granularity;


Should be final?

cdeszaq · 2017-02-10T21:55:57Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/ResultSetSchema.java

+     */
+    public ResultSetSchema(
+            @NotNull Granularity granularity, Iterable<Column> columns
+    ) {


cdeszaq · 2017-02-10T21:57:40Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/ResultSetSchema.java

+     *
+     * @param resultSetSchema the result set schema being copied
+     */
+    public ResultSetSchema(ResultSetSchema resultSetSchema) {


I don't think this is needed. Since ResultSetSchema is immutable, creating a new object with exactly the same contents isn't needed because it's exactly the same as simply using the 1st instance.

cdeszaq

Part 2

cdeszaq · 2017-02-14T19:41:46Z

fili-core/src/main/java/com/yahoo/bard/webservice/async/AsyncUtils.java

@@ -61,7 +61,9 @@ public static PreResponse buildErrorPreResponse(Throwable throwable) {
        }

        return new PreResponse(
-                new ResultSet(Collections.emptyList(), new Schema(AllGranularity.INSTANCE)),
+                new ResultSet(
+                        new ResultSetSchema(AllGranularity.INSTANCE, Collections.emptySet()), Collections.emptyList()


Param list should "chop" down if it needs to wrap. (Collections.emptyList() belongs on a new line)

cdeszaq · 2017-02-14T20:01:40Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/ResultSetSchema.java

+     *
+     * @return the result set being constructed
+     */
+    public ResultSetSchema withAddColumn(Column c) {


I realize it's not particularly confusing what c is, but could we get a better name for the param in the interest of self-documenting (if nothing else)?

cdeszaq · 2017-02-14T20:04:38Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/config/names/ApiMetricName.java

+    static ApiMetricName of(String name) {
+        return new ApiMetricName() {
+            @Override
+            public boolean isValidFor(final TimeGrain grain) {


final not needed

cdeszaq · 2017-02-14T20:05:41Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/config/names/TableName.java

+     *
+     * @param name the name being wrapped
+     *
+     * @return an anonymous subclass instance of ApiMetricName


Copy-paste error

cdeszaq · 2017-02-14T20:07:18Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/config/names/ApiMetricName.java

+            public String asName() {
+                return name;
+            }
+        };


Should this implement equals and hashcode?

cdeszaq · 2017-02-15T21:50:46Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/metric/MetricColumn.java

    /**
     * Constructor.
     *
     * @param name  The column name
     */
-    protected MetricColumn(String name) {
+    public MetricColumn(String name) {


Was this made public b/c the static "factory" thing went away?

cdeszaq · 2017-02-15T22:00:05Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/metric/mappers/RowNumMapper.java

-    protected Result map(Result result, Schema schema) {
-        return result;
+    protected Result map(Result result, ResultSetSchema schema) {
+        throw new UnsupportedOperationException("This code should never be reached.");


Why should it never be reached? It's not clear why this method is not supported.

Because it is potentially called only from the general map function, and it isn't here.

cdeszaq · 2017-02-15T22:03:37Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/metric/mappers/RowNumMapper.java

-        Result newResult;
+        ResultSetSchema schema = map(resultSet.getSchema());
+        MetricColumn column = schema.getColumn(ROW_NUM_COLUMN_NAME, MetricColumn.class).orElseThrow(
+                () -> new IllegalStateException("Unexpected missing column")


Perhaps we should say which column is missing?

And should this be in the ErrorMessage enum?

And we should probably log this as an error, since it's likely to kill the request I think?

I'm just going to remove this check. We literally just added this column to the schema the line before.

cdeszaq · 2017-02-16T18:48:04Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/metric/mappers/SketchRoundUpMapper.java

        if (columnName == null) {
            throw new IllegalStateException("Cannot map results without a column name");
        }

-        MetricColumn metricColumn = (MetricColumn) schema.getColumn(columnName);
+        MetricColumn metricColumn = schema.<MetricColumn>getColumn(columnName, MetricColumn.class).orElseThrow(
+                () -> new IllegalStateException("Unexpected missing column")


Perhaps we should say which column is missing?

And should this be in the ErrorMessage enum?

And we should probably log this as an error, since it's likely to kill the request I think?

Next commit will add capture for ResultSetMapper errors above where mapping is called.

I've pushed the error handling for this up into ResultSetResponseProcessor.
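
To make the "say which column is missing" suggestion concrete, here is a minimal sketch. ColumnLookup and its Map-based storage are hypothetical stand-ins for the schema's typed getColumn lookup; the real ResultSetSchema API may differ.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a schema's typed column lookup.
final class ColumnLookup {

    private ColumnLookup() { }

    // Finds a column by name and returns it only if it has the requested type.
    static <T> Optional<T> getColumn(Map<String, Object> columns, String name, Class<T> type) {
        return Optional.ofNullable(columns.get(name)).filter(type::isInstance).map(type::cast);
    }

    // Same lookup, but a miss produces an error message that names the column and expected type.
    static <T> T requireColumn(Map<String, Object> columns, String name, Class<T> type) {
        return getColumn(columns, name, type).orElseThrow(
                () -> new IllegalStateException(
                        "Missing column '" + name + "' of type " + type.getSimpleName()
                )
        );
    }
}
```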

cdeszaq · 2017-02-16T19:00:46Z

fili-core/src/main/java/com/yahoo/bard/webservice/druid/model/datasource/TableDataSource.java

        super(DefaultDataSourceType.TABLE, Collections.singleton(physicalTable));

-        this.name = physicalTable.getFactTableName();
+        this.name = physicalTable.getAvailability().getDataSourceNames().stream().findFirst().get().asName();


To simplify this, and because ConcretePhysicalTable can only have 1 backing table, could/should we push this logic into ConcretePhysicalTable and hide it behind a getTableName() method?

And actually, after looking a bit deeper, ConcretePhysicalTable already / still has a getFactTableName method that does this logic, so we should use that and rely on the contract / concept of ConcretePhysicalTable.

archolewa

A few minor quibbles and questions. In general though, it looks great.

archolewa · 2017-02-17T21:37:28Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/DruidQueryBuilder.java

@@ -202,8 +203,9 @@ protected GroupByQuery buildGroupByQuery(
        if (!template.isNested()) {
            LOG.trace("Building a single pass druid groupBy query");

+            // TODO FIXME hardcoding to concrete for now


Do we have an issue for fixing this?

archolewa · 2017-02-17T21:41:05Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/DruidQueryBuilder.java

        // The data source is the table directly, since there is no nested query below us
-        DataSource dataSource = new TableDataSource(table);
+        DataSource dataSource = new TableDataSource((ConcretePhysicalTable) table);


It looks like we are casting to a ConcretePhysicalTable in each of these building methods. Should we just update the signatures of these methods to take a ConcretePhysicalTable instead, and then do the cast before invoking these methods?

Also, what's the planned alternative to this? It looks like we're implicitly assuming that when we go to actually build a druid query, we are working with actual druid tables. Which makes sense until you factor in union datasources, unless union datasource are also considered ConcretePhysicalTables (haven't gotten that far).

archolewa · 2017-02-17T21:43:57Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/DruidResponseParser.java

@@ -40,11 +39,17 @@
     * @param jsonResult  Druid results in json
     * @param schema  Schema for results
     * @param queryType  the type of query, note that this implementation only supports instances of
+     * @param dateTimeZone the time zone used for format the results
     * {@link DefaultQueryType}


This line goes with the parameter queryType, not dateTimeZone

archolewa · 2017-02-17T21:46:58Z

...ore/src/main/java/com/yahoo/bard/webservice/table/resolver/DefaultPhysicalTableResolver.java

@@ -2,11 +2,11 @@
 // Licensed under the terms of the Apache license. Please see LICENSE.md file distributed with this work for terms.
 package com.yahoo.bard.webservice.table.resolver;

+import com.yahoo.bard.webservice.table.PhysicalTable;


This should be kept where it was.

archolewa · 2017-02-17T21:50:38Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/PreResponseDeserializer.java

-                generateGranularity(schemaNode.get(SCHEMA_GRANULARITY).asText(), timezone),
-                timezone
-        );
+        Granularity granularity = generateGranularity(schemaNode.get(SCHEMA_GRANULARITY).asText(), timezone);


This doesn't seem to be used.

archolewa · 2017-02-17T21:54:29Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/ResultSet.java

     */
-    public ResultSet(List<Result> results, Schema schema) {
+    public ResultSet(ResultSetSchema schema, List<Result> results) {


Why are we flipping the order of parameters? Seems like an unnecessary backwards incompatible change.

because the old order was stupid

archolewa · 2017-02-17T22:18:35Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/ResultSetSchema.java

+/**
+ * The schema for a result set.
+ */
+public class ResultSetSchema extends BaseSchema implements Schema {


There is a bit of inconsistency here concerning whether or not a ResultSetSchema has an order. It takes as an argument a simple Set, but withAddColumn refers to a final column, which only makes sense if the Schema has an ordering.

So in other words, we have a behavior, where the original columns are unordered, until we add another column, in which case the new schema has a fixed (arbitrary) order such that the new columns added always appear last?

I think I remember us talking about something like this, where something took an unordered collection and spat out an ordered collection, and I remember being ok with it, but I don't remember if it was for this class or a different one. But either way, this "no fixed order until we add a new column then order" is kind of weird.

If we want this schema to have some kind of order, then we should probably take in a LinkedHashSet. Or try to explain that it imposes a (potentially arbitrary) order on the columns depending on how the passed in set's iterator happens to spit out the columns

I'm switching to an "Iterable in, LinkedHashSet out" idiom.
One of the libraries we use supports that, and it makes a lot of sense because it unambiguously communicates that we're set-ifying, and the Iterable makes it clear that we're concerned with order while being polymorphic across sources.
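
A rough sketch of that idiom using only the JDK (the library mentioned above is not named in this thread, so none is assumed here; OrderedColumns and toOrderedSet are hypothetical names):

```java
import java.util.LinkedHashSet;

// Hypothetical helper illustrating "Iterable in, LinkedHashSet out".
final class OrderedColumns {

    private OrderedColumns() { }

    // Accepts any ordered source and returns a set that keeps first-seen order and drops duplicates.
    static <T> LinkedHashSet<T> toOrderedSet(Iterable<? extends T> source) {
        LinkedHashSet<T> result = new LinkedHashSet<>();
        source.forEach(result::add);
        return result;
    }
}
```

With this shape, the resulting order is whatever the source's iterator yields, so passing an unordered collection such as a HashSet still produces an arbitrary (but then fixed) order.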

archolewa · 2017-02-28T22:06:25Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/resolver/PartialTimeComparator.java

@@ -2,9 +2,9 @@
 // Licensed under the terms of the Apache license. Please see LICENSE.md file distributed with this work for terms.
 package com.yahoo.bard.webservice.table.resolver;

+import com.yahoo.bard.webservice.table.PhysicalTable;


This should stay where it was.

archolewa · 2017-02-28T22:06:47Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/resolver/PhysicalTableResolver.java

 import com.yahoo.bard.webservice.table.PhysicalTable;
+import com.yahoo.bard.webservice.data.metric.TemplateDruidQuery;


archolewa · 2017-02-28T22:07:24Z

...-core/src/main/java/com/yahoo/bard/webservice/table/resolver/SchemaPhysicalTableMatcher.java

@@ -4,12 +4,12 @@

 import static com.yahoo.bard.webservice.web.ErrorMessageFormat.TABLE_SCHEMA_UNDEFINED;

+import com.yahoo.bard.webservice.table.PhysicalTable;


archolewa · 2017-03-06T16:45:20Z

👍

garyluoex · 2017-03-06T23:49:38Z

👍 The upcoming PRs will change many things in here

cdeszaq

Part A

cdeszaq · 2017-03-07T16:57:24Z

fili-core/src/main/java/com/yahoo/bard/webservice/async/AsyncUtils.java

+                new ResultSet(new ResultSetSchema(
+                        AllGranularity.INSTANCE,
+                        Collections.emptySet()),
+                        Collections.emptyList()),


/notablocker

Wrap the closing paren onto the next line

cdeszaq · 2017-03-07T16:57:40Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/DruidQueryBuilder.java

@@ -35,6 +38,9 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;

+import avro.shaded.com.google.common.collect.Sets;


This is certainly not the right import. We shouldn't depend on a shadowed import in another library.

cdeszaq · 2017-03-07T17:03:41Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/DruidQueryBuilder.java

            // The data source is the table directly, since there is no nested query below us
-            DataSource dataSource = new TableDataSource(table);
+            //DataSource dataSource = new TableDataSource((ConcretePhysicalTable) table);


Can we just delete this? If not, add a comment explaining why keeping it makes sense.

cdeszaq · 2017-03-07T17:07:58Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/DruidQueryBuilder.java

        // The data source is the table directly, since there is no nested query below us
-        DataSource dataSource = new TableDataSource(table);
+        DataSource dataSource = new TableDataSource((ConcretePhysicalTable) table);


Should this have the same "concrete if concrete otherwise Union" logic that the GroupBy builder method has?

Yep. Fix incoming.

cdeszaq · 2017-03-07T17:09:00Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/DruidQueryBuilder.java

+        Set<TableName> names = table.getAvailability().getDataSourceNames();
+        if (names.isEmpty()) {
+            throw new IllegalStateException("Misconfigured table with no backing datasource.");
+        } else if (names.size() == 1 && table instanceof ConcretePhysicalTable) {


Why is the TimeseriesQueryBuild method yet a 3rd version of constructing the DataSource?

refactored in next checkin

cdeszaq · 2017-03-07T18:12:10Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/BasePhysicalTable.java

+     *
+     * @param logicalName  Logical name to lookup in physical table
+     * @return Translated logicalName if applicable
+     */


/NotABlocker

JavaDoc not needed, since it's a duplicate of the interface

cdeszaq · 2017-03-07T18:12:52Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/BasePhysicalTable.java

+    public Map<Column, List<Interval>> getAvailableIntervals() {
+        return getAvailability().getAvailableIntervals();
+    }
+    /**


Blank line needed.

cdeszaq · 2017-03-07T18:12:59Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/BasePhysicalTable.java

+     *
+     * @return tableEntries map of column to set of available intervals
+     *
+     */


/NotABlocker

JavaDoc not needed, since it's a duplicate of the interface

cdeszaq · 2017-03-07T18:13:08Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/BasePhysicalTable.java

+     * @return The time grain of this physical table
+     *
+     * @deprecated use getSchema().getGranularity()
+     */


/NotABlocker

JavaDoc not needed, since it's a duplicate of the interface

cdeszaq · 2017-03-07T18:14:08Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/BasePhysicalTable.java

+     */
+    public void resetColumns(SegmentMetadata segmentMetadata, DimensionDictionary dimensionDictionary) {
+        Map<String, Set<Interval>> dimensionIntervals = segmentMetadata.getDimensionIntervals();
+        Map<String, Set<Interval>> metricIntervals = segmentMetadata.getMetricIntervals();


/NotABlocker

Can just inline these

cdeszaq

Part B

cdeszaq · 2017-03-07T18:19:14Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/BaseSchema.java

+
+    @Override
+    public LinkedHashSet<Column> getColumns() {
+        return columns;


This needs to be (somewhere, probably the constructor) made to be immutable (if possible).

Harder than it seems. There are no types that ARE immutable but also extend LinkedHashSet

Ahh, I see. That's dumb.
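
For context, one possible workaround is sketched below. It widens the getter's return type from LinkedHashSet to Set, which is an API change this PR does not make, so treat it purely as an illustration. OrderedSchema is a hypothetical class, not Fili's BaseSchema.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical schema holder: ordered internally, read-only to callers.
final class OrderedSchema<T> {

    private final Set<T> columns;

    OrderedSchema(Iterable<? extends T> source) {
        // Defensive copy, so later changes to the source cannot leak into this object.
        LinkedHashSet<T> copy = new LinkedHashSet<>();
        source.forEach(copy::add);
        // The unmodifiable view keeps the LinkedHashSet's iteration order but is typed only as Set.
        this.columns = Collections.unmodifiableSet(copy);
    }

    Set<T> getColumns() {
        return columns;
    }
}
```

The trade-off is that the static type no longer advertises the ordering guarantee, which is presumably why returning LinkedHashSet was attractive in the first place.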

cdeszaq · 2017-03-07T18:20:09Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/ConcretePhysicalTable.java

+import javax.validation.constraints.NotNull;
+
+/**
+ * An instance of Physical table that is backed by a single fact table.


/NotABlocker

It's not an instance, but perhaps it's an implementation?

cdeszaq · 2017-03-07T18:21:38Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/ConcretePhysicalTable.java

+    /**
+     * Create a concrete physical table.
+     *
+     * @param name  Fili name of the physical table


This doc should make it clearer what the difference is between this and the other constructor, particularly around things like what values are used as defaults...

cdeszaq · 2017-03-07T18:29:04Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/LogicalTable.java

+     * Getter.
+     *
+     * @return The granularity
+     */


/NotABlocker

This javadoc isn't needed, since this is just a getter.

cdeszaq · 2017-03-07T18:32:20Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/PhysicalTable.java


 /**
- * Physical Table represents a druid table.
+ * An interface describing a config level physical table.


It's odd that this interface references config. Isn't it a very core interface, and used by lots more than config?

It's also a bit self-referential... Perhaps we cast it in a FactSourceRepresentation light?

cdeszaq · 2017-03-07T19:00:00Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/availability/ImmutableAvailability.java

+/**
+ * An availability which guarantees immutability on its contents.
+ */
+//public class ImmutableAvailability extends ImmutableWrapperMap<Column, List<Interval>> implements Availability {


cdeszaq · 2017-03-07T19:01:59Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/availability/ImmutableAvailability.java

+//public class ImmutableAvailability extends ImmutableWrapperMap<Column, List<Interval>> implements Availability {
+public class ImmutableAvailability implements Availability {
+
+    private final TableName name;


/NotABlocker

Technically, because you're holding an interface here which itself has no immutability assertion, this class cannot ensure immutability.

cdeszaq · 2017-03-07T19:02:55Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/availability/ImmutableAvailability.java

+
+    @Override
+    public Set<TableName> getDataSourceNames() {
+        return Sets.newHashSet(name);


Can we do this conversion in the constructor?

cdeszaq · 2017-03-07T19:03:18Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/availability/ImmutableAvailability.java

+
+    @Override
+    public int hashCode() {
+        return columnIntervals.hashCode();


Don't we need to include the TableName?

Still needs addressing?

cdeszaq · 2017-03-07T19:03:45Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/availability/ImmutableAvailability.java

+        if (this == obj) {
+            return true;
+        }
+        if (obj instanceof ImmutableAvailability) {


This doesn't seem to pay attention to TableName...
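
A minimal sketch of equals and hashCode that take the name into account. NamedAvailability is a hypothetical stand-in that uses String in place of TableName, Column, and Interval; it is not the actual ImmutableAvailability.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for an availability that is identified by both its name and its contents.
final class NamedAvailability {

    private final String name;
    private final Map<String, List<String>> columnIntervals;

    NamedAvailability(String name, Map<String, List<String>> columnIntervals) {
        this.name = name;
        this.columnIntervals = columnIntervals;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof NamedAvailability)) {
            return false;
        }
        NamedAvailability that = (NamedAvailability) obj;
        // Both fields participate, so two tables with identical intervals are still distinct.
        return Objects.equals(name, that.name) && Objects.equals(columnIntervals, that.columnIntervals);
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals to keep the equals/hashCode contract.
        return Objects.hash(name, columnIntervals);
    }
}
```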

cdeszaq

Reached end of changes.

cdeszaq · 2017-03-07T23:03:08Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/DruidResponseParserSpec.groovy

+        result.getDimensionRow((DimensionColumn) schema.columns.toArray()[0])?.get(BardDimensionField.DESC) == "u"
+        result.getDimensionRow((DimensionColumn) schema.columns.toArray()[2])?.get(BardDimensionField.DESC) == "4"
+        result.getDimensionRow((DimensionColumn) schema.columns.toArray()[3])?.get(BardDimensionField.DESC) == ""
+        result.getDimensionRow((DimensionColumn) schema.columns.toArray()[3])?.get(BardDimensionField.ID) == "foo"


Why are these changing ordinal and order?

I have no idea. @garyluoex , did you change these?

cdeszaq · 2017-03-07T23:03:20Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/DruidResponseParserSpec.groovy

-        resultWithNullDimensionKey.getDimensionRow(schema.columns.toArray()[0])?.get(BardDimensionField.ID) == ""
-        resultWithNullDimensionKey.getDimensionRow(schema.columns.toArray()[0])?.get(BardDimensionField.DESC) ==
+        resultWithNullDimensionKey.getDimensionRow((DimensionColumn) schema.columns.toArray()[2])?.get(BardDimensionField.ID) == ""
+        resultWithNullDimensionKey.getDimensionRow((DimensionColumn) schema.columns.toArray()[2])?.get(BardDimensionField.DESC) ==


Why the ordinal change?

Note: This seems to apply to many other places in this file as well.

cdeszaq · 2017-03-08T15:42:58Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/DruidResponseParserSpec.groovy

-        MetricColumn.addNewMetricColumn(schema, "lookback_pageViews")
-        MetricColumn.addNewMetricColumn(schema, "retentionPageViews")
-        ResultSet resultSet = new DruidResponseParser().parse(jsonResult, schema, DefaultQueryType.LOOKBACK)
+        ResultSetSchema schema = new ResultSetSchema(DAY, [new MetricColumn("pageViews"), new MetricColumn("lookback_pageViews"), new MetricColumn("retentionPageViews")].toSet())


/NotABlocker

Rather than calling .toSet(), it's more idiomatic to call as Set in Groovy.

Note: Applies to other places in this PR as well.

cdeszaq · 2017-03-08T15:43:55Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/DruidResponseParserSpec.groovy


        then:
        thrown(UnsupportedOperationException)

    }
-
-    def "Build the schema from the query"() {


To where did this test move?

Just moved it here: ResultSetResponseProcessorSpec (next commit)

cdeszaq · 2017-03-08T15:44:50Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/DruidResponseParserSpec.groovy

        }
+
+        Schema schema = new ResultSetSchema(DAY, columns)


/NotABlocker

Can just return the new ResultSetSchema(...) directly. No need to assign to temp var

cdeszaq · 2017-03-08T16:05:48Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/QueryBuildingTestingResources.groovy

+
+        Map<Column, List<Interval>> availabilityMap1 = [:]
+        Map<Column, List<Interval>> availabilityMap2 = [:]
+        Map<Column, List<Interval>> availabilityMap3 = [:]


Can we give these better names? "1", "2", "3" isn't very clear about what these will hold.

cdeszaq · 2017-03-08T16:09:56Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/SerializationResources.groovy

-        responseContext.put("missingIntervals", ["a","b","c", new SimplifiedIntervalList([interval]), bigDecimal])
+        responseContext.put(
+                "missingIntervals",
+                ["a", "b", "c", new SimplifiedIntervalList([interval]), bigDecimal] as ArrayList


Why do we need to coerce this to an ArrayList?

ArrayList is serializable.
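
To illustrate the reply above: java.util.List does not extend java.io.Serializable, but java.util.ArrayList does, so pinning the concrete type guarantees the container itself can go through Java serialization (its elements still have to be serializable too). A minimal sketch, assuming plain Java serialization; the class and method names are hypothetical and not part of Fili:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical demo class; not part of the PR.
final class SerializableListDemo {

    // Round-trips a value through Java serialization and returns the deserialized copy.
    static Object roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // Wrapped so callers do not have to declare checked exceptions.
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // ArrayList implements Serializable, so it is accepted by roundTrip.
        ArrayList<String> missingIntervals = new ArrayList<>(Arrays.asList("a", "b", "c"));
        Object copy = roundTrip(missingIntervals);
        System.out.println(copy.equals(missingIntervals));
    }
}
```

A variable declared as List<String> would not compile as an argument to roundTrip without a cast, which is the kind of gap the coercion closes.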

cdeszaq · 2017-03-08T16:10:44Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/SerializationResources.groovy

        }
+        ResultSetSchema schema = new ResultSetSchema(granularity, columns)
        return schema


/NotABlocker

Just return it directly?

cdeszaq · 2017-03-08T16:14:51Z

...test/groovy/com/yahoo/bard/webservice/druid/model/aggregation/FilteredAggregationSpec.groovy

-                ScanSearchProviderManager.getInstance("gender")
-        )
+        def filtered_metric_name = "FOO_NO_BAR"
+        Set<ApiMetricName> metricNames = (["FOO", filtered_metric_name].collect() { ApiMetricName.of(it)}) as Set


/NotABlocker

Empty parens are not needed on collect here.

cdeszaq · 2017-03-08T16:30:09Z

...roovy/com/yahoo/bard/webservice/web/responseprocessors/ResultSetResponseProcessorSpec.groovy

 import com.yahoo.bard.webservice.web.DataApiRequest
 import com.yahoo.bard.webservice.web.ResponseFormatType

 import com.fasterxml.jackson.databind.JsonNode

 import org.joda.time.DateTimeZone

+import avro.shaded.com.google.common.collect.Sets


Wrong import

Making Schema very immutable. Changing column build pattern as a result.

cdeszaq

Just a few final things. Nothing big.

cdeszaq · 2017-03-13T15:23:39Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/DruidQueryBuilder.java

@@ -219,37 +241,19 @@ protected GroupByQuery buildGroupByQuery(
            );
        }


Indentation is wrong here, and I think there's a regression where, in a nested case, filters are also being applied at the outer query, when they were not before. I don't think this is a correctness issue, but it's definitely an optimality issue in terms of query size, as well as the work Druid is doing for filtering.

Wait, never mind. I see what's happening. I missed the fact that filters get set to null when building the inner query. We should definitely add a comment on that, and probably toss a comment on the outer query filter location, so that it's obvious what had happened (it's easy to miss).

cdeszaq · 2017-03-13T15:24:57Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/DruidQueryBuilder.java


-            // The data source is the table directly, since there is no nested query below us
-            DataSource dataSource = new TableDataSource(table);
+            filter = null;


Should definitely add a comment here to call this out.
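
As a rough illustration of the kind of comment being asked for, here is a sketch of the pattern described in this thread: the filter is applied to the inner query and then cleared so the outer query does not repeat it. The Query type and build method below are hypothetical, heavily simplified stand-ins and are not Fili's DruidQueryBuilder API.

```java
// Hypothetical, simplified stand-ins for the query model; not Fili's classes.
final class NestedQuerySketch {

    static final class Query {
        final String filter;     // null means "no filter on this query"
        final Query innerQuery;  // null means "reads from the table directly"

        Query(String filter, Query innerQuery) {
            this.filter = filter;
            this.innerQuery = innerQuery;
        }
    }

    static Query build(String filter, boolean nested) {
        Query inner = null;
        if (nested) {
            // The inner query reads from the table, so the filter is applied there.
            inner = new Query(filter, null);
            // Clear the filter: the inner query has already filtered the rows, so the
            // outer query built below must not carry the same filter again.
            filter = null;
        }
        // Non-nested case: filter is still set and is applied here.
        // Nested case: filter is null at this point (see above).
        return new Query(filter, inner);
    }

    public static void main(String[] args) {
        Query outer = build("gender = f", true);
        System.out.println("outer filter: " + outer.filter);
        System.out.println("inner filter: " + outer.innerQuery.filter);
    }
}
```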

cdeszaq · 2017-03-13T15:35:09Z

fili-core/src/main/java/com/yahoo/bard/webservice/data/config/table/BaseTableLoader.java

-     *
-     * @return The logical table built
-     */
-    public LogicalTable buildLogicalTable(


Since these were public methods, are we sure we can / want to just remove them? I'm pretty sure other code bases were relying on them for configuration, so if we could deprecate them instead, that would be better.
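
One possible shape for the deprecation route suggested here: keep the old public method, mark it @Deprecated, and have it delegate to the replacement. The class, method names, and signatures below are illustrative assumptions only and do not reflect the real BaseTableLoader API.

```java
// Hypothetical loader; names and signatures are illustrative, not the real BaseTableLoader API.
class TableLoaderSketch {

    /**
     * New entry point (hypothetical).
     *
     * @param name  The table name
     * @param granularity  The granularity name
     *
     * @return a description of the loaded table
     */
    public String loadLogicalTable(String name, String granularity) {
        return name + "@" + granularity;
    }

    /**
     * Old entry point, kept so downstream configuration code keeps compiling.
     *
     * @param name  The table name
     * @param granularity  The granularity name
     *
     * @return a description of the loaded table
     *
     * @deprecated use {@link #loadLogicalTable(String, String)} instead
     */
    @Deprecated
    public String buildLogicalTable(String name, String granularity) {
        // Delegate so the old and new paths cannot drift apart.
        return loadLogicalTable(name, granularity);
    }

    public static void main(String[] args) {
        TableLoaderSketch loader = new TableLoaderSketch();
        System.out.println(loader.buildLogicalTable("pageViews", "day"));
    }
}
```

Callers of the old method get a compiler warning rather than a build break, which leaves room to remove it in a later release.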

cdeszaq · 2017-03-13T15:41:22Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/BasePhysicalTable.java

+        ));
+    }
+
+    public void setAvailability(Availability availability) {


Should this be protected?

cdeszaq · 2017-03-13T15:45:41Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/PhysicalTable.java

-        return super.toString() + " factTableName: " + factTableName + " alignment: " + getTableAlignment();
-    }
+    /**
+     * Get the time grain from the physical table.


This javadoc doesn't match what the method seems to be doing, based on its name.

cdeszaq · 2017-03-13T15:50:03Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/PhysicalTable.java

-        return physicalToLogicalColumnNames.getOrDefault(physicalName, Collections.singleton(physicalName));
-    }
+    @Deprecated
+    Set<Column> getColumns();


For all of these deprecated methods, they can be made default and the implementations can be lifted / removed from the BasePhysicalTable up into this interface. Right now, there's a mix of some of the deprecated methods being defaulted (but still implemented), some deprecated, but their implementations pushed into the base class, etc.
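
A sketch of what lifting a deprecated method into the interface as a default could look like. The interfaces below are hypothetical, cut-down stand-ins (string column names instead of Column objects) and are not the actual PhysicalTable or Schema definitions.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical, cut-down schema; not Fili's Schema interface.
interface SchemaSketch {
    Set<String> getColumnNames();
}

// Hypothetical, cut-down table; not Fili's PhysicalTable interface.
interface PhysicalTableSketch {

    SchemaSketch getSchema();

    /**
     * Deprecated accessor implemented once, in the interface, by delegating to the schema.
     * Implementing classes no longer need their own copy.
     *
     * @return the column names of this table's schema
     *
     * @deprecated ask the schema directly via {@link #getSchema()}
     */
    @Deprecated
    default Set<String> getColumnNames() {
        return getSchema().getColumnNames();
    }
}

final class DefaultMethodDemo {
    public static void main(String[] args) {
        SchemaSketch schema = () -> Collections.singleton("age");
        // Only getSchema() has to be supplied; getColumnNames() comes from the default.
        PhysicalTableSketch table = () -> schema;
        System.out.println(table.getColumnNames());
    }
}
```

With the default in place, the base class only has to supply the non-deprecated accessor, which avoids the mix of defaulted and base-class implementations described above.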

cdeszaq · 2017-03-13T15:54:22Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/availability/ImmutableAvailability.java

+    public ImmutableAvailability(TableName tableName, Map<Column, List<Interval>> map) {
+        this.name = tableName;
+        columnIntervals = ImmutableMap.copyOf(map);
+        dataSourceNames = Sets.newHashSet(name);


This should make it into an immutable set (either ImmutableSet.of(name) or something using Collections)
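
For reference, a minimal sketch of the Collections-based option using only the JDK; Guava's ImmutableSet.of(name) would serve the same purpose. The class below is a hypothetical stand-in (with a String name instead of TableName) and is not the actual ImmutableAvailability.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical stand-in; not the actual ImmutableAvailability class.
final class ImmutableNamesSketch {

    private final Set<String> dataSourceNames;

    ImmutableNamesSketch(String tableName) {
        // Collections.singleton returns an immutable one-element set, so callers
        // holding a reference cannot add to or remove from it.
        this.dataSourceNames = Collections.singleton(tableName);
    }

    Set<String> getDataSourceNames() {
        return dataSourceNames;
    }

    public static void main(String[] args) {
        ImmutableNamesSketch availability = new ImmutableNamesSketch("table");
        try {
            availability.getDataSourceNames().add("other");
        } catch (UnsupportedOperationException e) {
            System.out.println("modification rejected");
        }
    }
}
```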

cdeszaq · 2017-03-13T16:00:17Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/DruidResponseParserSpec.groovy

-        resultSet[0].getMetricValueAsNumber(schema.getColumn("pageViews")) == 1 as BigDecimal
-        resultSet[0].getMetricValueAsNumber(schema.getColumn("time_spent")) == 2 as BigDecimal
+        Result firstResult = resultSet.get(0)
+        List<Column> columns = new ArrayList<>(schema.columns)


cdeszaq · 2017-03-13T16:01:24Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/DruidResponseParserSpec.groovy

+    Set<Column> dimensionColumns
+    Column ageColumn
+    Column genderColumn
+    Column unknownColumn


These should all be DimensionColumn-typed

cdeszaq · 2017-03-13T16:01:53Z

fili-core/src/test/groovy/com/yahoo/bard/webservice/data/DruidResponseParserSpec.groovy

@@ -43,13 +36,17 @@ class DruidResponseParserSpec extends Specification {

    @Shared DimensionDictionary dimensionDictionary


/NotABlockerBecauseYouDidn'tChangeIt

It would be great to make this not be shared so that we can't leak state across tests. (or at least not as easily)

cdeszaq · 2017-03-13T16:27:57Z

fili-core/src/main/java/com/yahoo/bard/webservice/table/availability/ImmutableAvailability.java

+
+import org.joda.time.Interval;
+
+import avro.shaded.com.google.common.collect.Sets;


Wrong import

cdeszaq

👍 pending getting the build to pass.

Javadoc tightening Stream simplification Test variable renames

QubitPi reviewed Feb 8, 2017

michael-mclawhorn added the WIP label Feb 8, 2017

michael-mclawhorn force-pushed the RefactorTable branch from f9045c4 to 4620d8c Compare February 9, 2017 23:48

michael-mclawhorn added NEED 2 REVIEWS REVIEWABLE and removed WIP labels Feb 9, 2017

garyluoex requested changes Feb 13, 2017

cdeszaq requested changes Feb 14, 2017

garyluoex mentioned this pull request Feb 14, 2017

1. Implement QueryPlanning and DataSource Constraints into Resolvers and Matchers #169

Merged

cdeszaq requested changes Feb 16, 2017

michael-mclawhorn force-pushed the RefactorTable branch from 71e52d5 to 7e024ba Compare February 27, 2017 23:56

archolewa requested changes Feb 28, 2017

michael-mclawhorn force-pushed the RefactorTable branch from 7e024ba to fce490b Compare March 2, 2017 17:11

archolewa approved these changes Mar 6, 2017

michael-mclawhorn force-pushed the RefactorTable branch from c4c01bb to 21ac864 Compare March 6, 2017 16:51

garyluoex approved these changes Mar 6, 2017

cdeszaq requested changes Mar 7, 2017

cdeszaq requested changes Mar 8, 2017

michael-mclawhorn added 4 commits March 10, 2017 11:22

Design notes

f033fab

Making Schema a property of tables.

320c0fa

Making Schema very immutable. Changing column build pattern as a result.

Zoning some time.

f903d46

Updating references to schemas and tables.

c80e580

michael-mclawhorn force-pushed the RefactorTable branch from 9822ebf to 6384e6b Compare March 10, 2017 17:40

cdeszaq requested changes Mar 13, 2017

cdeszaq reviewed Mar 13, 2017

cdeszaq approved these changes Mar 13, 2017

CHANGELOG

5772ead

Javadoc tightening Stream simplification Test variable renames

michael-mclawhorn force-pushed the RefactorTable branch from 7f2e498 to 5772ead Compare March 13, 2017 21:07

michael-mclawhorn merged commit 5772ead into master Mar 13, 2017

michael-mclawhorn deleted the RefactorTable branch March 13, 2017 22:37

