
Allow MetricsControl to aggregate on a column with an expression #5021

Merged

Conversation

michellethomas
Contributor

In the new MetricsControl, selecting a column with an expression would fail because it used only the column name, which doesn't exist in the underlying table (only in our metadata). This change uses the column's expression (if there is one) inside the aggregation.
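The idea behind the fix can be sketched as follows. This is a minimal illustration, not the actual Superset code; the `build_aggregate_sql` helper and the column-metadata shape are hypothetical:

```python
# Minimal sketch of the fix: when building an adhoc metric, use the
# column's SQL expression (if one is defined in the metadata) instead
# of the bare column name, so the generated aggregate is valid SQL.
# The metadata dict shape here is hypothetical, for illustration only.

def build_aggregate_sql(aggregate, column):
    """Return e.g. SUM(...) over the column's expression when present."""
    # Fall back to the physical column name when no expression is defined.
    inner = column.get("expression") or column["column_name"]
    return f"{aggregate}({inner})"

# A plain physical column aggregates on its name...
physical = {"column_name": "num", "expression": None}
# ...while a computed column aggregates on its expression.
computed = {"column_name": "first_letter", "expression": "substr(name, 1, 1)"}
```

Before the fix, the computed case would have emitted `COUNT(first_letter)`, which fails because no such physical column exists.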

@john-bodley @GabeLoins @mistercrunch

@gabe-lyons

@michellethomas do we need to do something like this for druid as well?

@michellethomas
Contributor Author

I'm not sure whether you can use a dimension spec (column expression) within a postagg in Druid. When I looked through the Druid metrics code, I didn't see anywhere we use the DruidColumn expression or dimension_spec properties, so it doesn't look like we're doing this now. But I'm not certain.

Maybe John or Max know?

@codecov-io

codecov-io commented May 16, 2018

Codecov Report

Merging #5021 into master will increase coverage by 0.1%.
The diff coverage is 100%.


@@            Coverage Diff            @@
##           master    #5021     +/-   ##
=========================================
+ Coverage   77.35%   77.46%   +0.1%     
=========================================
  Files          44       44             
  Lines        8677     8683      +6     
=========================================
+ Hits         6712     6726     +14     
+ Misses       1965     1957      -8
Impacted Files Coverage Δ
superset/connectors/sqla/models.py 78.22% <100%> (+1.8%) ⬆️
superset/data/__init__.py 100% <100%> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Powered by Codecov. Last update ce0011e...f1db6f9.

@mistercrunch
Member

I'm not 100% sure, but I'm pretty sure Druid postaggs have to be built out of metrics.

Your PR LGTM. It'd be great to have a unit test that was failing before and succeeds now. For example, we could add a new expression in the "births" dataset called first_letter, defined as substr(name, 1, 1), and do a count distinct of it, or something like that.
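To make the suggested test case concrete, here is a sketch of the SQL the metric would generate before and after the fix (the strings below are illustrative, not the actual generated SQL):

```python
# Sketch of the suggested test case: a computed column "first_letter"
# whose expression is substr(name, 1, 1), aggregated with COUNT(DISTINCT).
# Before the fix, the generated SQL referenced the non-existent physical
# column "first_letter"; after it, the expression is inlined.
expression = "substr(name, 1, 1)"

broken_sql = "COUNT(DISTINCT first_letter)"  # fails: no such column in the table
fixed_sql = f"COUNT(DISTINCT {expression})"  # valid SQL against the births table
```

Adding such a chart to the examples makes the integration test suite exercise exactly this path.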

@michellethomas
Contributor Author

michellethomas commented May 16, 2018

@mistercrunch I didn't see a lot of tests for this section of the code. Did you have thoughts on where to add tests for this? It looks like I could fit it into viz_tests.TableVizTestCase.

@mistercrunch
Member

All examples get executed as integration tests, so if you add a new example chart that covers this use case, it should fail before your PR and succeed after it.

@michellethomas michellethomas force-pushed the adhoc_metrics_column_expression branch from 79c2155 to f1db6f9 Compare May 21, 2018 22:29
@michellethomas
Contributor Author

@mistercrunch added tests, thanks for the suggestion!

@mistercrunch
Member

LGTM

@john-bodley
Member

Note I believe this only supports one level of indirection, i.e., I'm not certain whether one can create another dimension which leverages num_california in your example.

@john-bodley
Member

@mistercrunch what are your thoughts about having metrics reference other metrics, or should metric expressions be valid SQL?

@john-bodley john-bodley merged commit b8aeb1a into apache:master May 22, 2018
@mistercrunch
Member

mistercrunch commented May 22, 2018

Valid SQL was always the assumption, for simplicity and clarity. MetricsControl changes the model a bit, though, as one might expect to see computed dimensions in there (which is what this fixes). I don't think people would expect to see MetricsControl-generated metrics inside the dropdown, though, meaning there should be only one level.
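The "one level" point can be sketched as follows. All names here are hypothetical illustrations, not Superset's actual code: computed columns expand to raw SQL inside a metric expression, but metric expressions are never expanded into other metrics, so each metric must be valid SQL after a single pass.

```python
# One level of indirection: metric expressions may reference computed
# columns (which are inlined as SQL), but not other saved metrics.
# The column name and helper below are hypothetical, for illustration.
computed_columns = {
    "is_california": "CASE WHEN state = 'CA' THEN 1 ELSE 0 END",
}

def expand_columns(expression, columns):
    """Inline computed-column definitions into a metric expression."""
    for name, sql in columns.items():
        expression = expression.replace(name, f"({sql})")
    return expression

# The metric's SQL is valid after a single expansion pass:
metric_sql = expand_columns("SUM(is_california)", computed_columns)
```

A metric referencing another metric would need a second expansion pass, which is exactly the indirection the model avoids.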

michellethomas added a commit to michellethomas/panoramix that referenced this pull request May 22, 2018
…che#5021)

* Allow MetricsControl to aggregate on a column with an expression

* Adding test case for metrics based on columns

(cherry picked from commit b8aeb1a)
michellethomas added a commit to michellethomas/panoramix that referenced this pull request May 24, 2018
…che#5021)

* Allow MetricsControl to aggregate on a column with an expression

* Adding test case for metrics based on columns
timifasubaa pushed a commit to timifasubaa/incubator-superset that referenced this pull request May 31, 2018
…che#5021)

* Allow MetricsControl to aggregate on a column with an expression

* Adding test case for metrics based on columns
wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018
…che#5021)

* Allow MetricsControl to aggregate on a column with an expression

* Adding test case for metrics based on columns
@mistercrunch mistercrunch added the `🏷️ bot` and `🚢 0.26.0` labels Feb 27, 2024