feat: query datasets from SQL Lab #15241

betodealmeida · 2021-06-17T22:17:02Z

SUMMARY

Introduce the dataset macro, allowing users to query physical and virtual datasets in SQL Lab. This is useful if you've defined computed columns and metrics on your datasets and want to reuse those definitions in ad hoc SQL Lab queries.

Because dataset names are currently not unique, the macro takes the dataset ID:

SELECT * FROM {{ dataset(42) }} LIMIT 10

To select the metric definitions in addition to the columns, users can pass an additional keyword argument:

SELECT * FROM {{ dataset(42, include_metrics=True) }} LIMIT 10

Since metrics are aggregations, the resulting SQL expression will be grouped by all non-metric columns. Users can also specify a subset of columns to group by instead:

SELECT * FROM {{ dataset(42, include_metrics=True, groupby=["ds", "category"]) }} LIMIT 10
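To make the expansion concrete, here is a minimal sketch of how a macro of this kind could turn a dataset ID into a subquery. It is not the code in this PR: the DATASETS registry, the render_dataset helper, the example dataset, and the dataset_42 alias are all hypothetical, and the actual macro builds its query from the dataset's SQLAlchemy model rather than by string formatting.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for dataset metadata (not Superset's model).
DATASETS: Dict[int, dict] = {
    42: {
        "table": "sales",
        "columns": {"ds": "ds", "category": "category", "revenue_usd": "revenue * 1.1"},
        "metrics": {"total_revenue": "SUM(revenue)"},
    }
}


def render_dataset(
    dataset_id: int,
    include_metrics: bool = False,
    groupby: Optional[List[str]] = None,
) -> str:
    """Return a parenthesized subquery for the dataset (illustrative only)."""
    dataset = DATASETS[dataset_id]
    # Use the requested subset only when metrics are included; otherwise all columns.
    names = groupby if (include_metrics and groupby) else list(dataset["columns"])
    select = [f'{dataset["columns"][name]} AS {name}' for name in names]
    group_by = ""
    if include_metrics:
        select += [f"{expr} AS {name}" for name, expr in dataset["metrics"].items()]
        # Metrics are aggregations, so group by every selected non-metric column.
        group_by = " GROUP BY " + ", ".join(dataset["columns"][n] for n in names)
    return (
        f'(SELECT {", ".join(select)} FROM {dataset["table"]}{group_by})'
        f" AS dataset_{dataset_id}"
    )
```

Under these assumptions, render_dataset(42, include_metrics=True, groupby=["ds", "category"]) yields a subquery that selects ds, category, and the metric, grouped by ds and category, which the surrounding SELECT * FROM ... LIMIT 10 can then read from like a table.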

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

Added unit tests.

ADDITIONAL INFORMATION

Has associated issue:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

codecov · 2021-06-17T22:33:50Z

Codecov Report

Merging #15241 (ed43b79) into master (6244728) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #15241      +/-   ##
==========================================
+ Coverage   66.46%   66.51%   +0.04%     
==========================================
  Files        1721     1724       +3     
  Lines       64466    64647     +181     
  Branches     6794     6794              
==========================================
+ Hits        42846    42997     +151     
- Misses      19892    19922      +30     
  Partials     1728     1728

Flag	Coverage Δ
hive	`53.68% <16.66%> (-0.01%)`	⬇️
mysql	`?`
postgres	`82.27% <100.00%> (+0.08%)`	⬆️
presto	`53.54% <16.66%> (-0.01%)`	⬇️
python	`82.62% <100.00%> (+<0.01%)`	⬆️
sqlite	`?`
unit	`50.00% <100.00%> (+0.60%)`	⬆️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
superset/connectors/base/models.py	`86.74% <ø> (ø)`
superset/jinja_context.py	`90.82% <100.00%> (+0.53%)`	⬆️
superset/datasets/filters.py	`86.95% <0.00%> (-13.05%)`	⬇️
superset/common/utils/dataframe_utils.py	`85.71% <0.00%> (-7.15%)`	⬇️
superset/db_engine_specs/sqlite.py	`91.89% <0.00%> (-5.41%)`	⬇️
superset/db_engine_specs/mysql.py	`93.97% <0.00%> (-3.62%)`	⬇️
superset/utils/celery.py	`86.20% <0.00%> (-3.45%)`	⬇️
superset/connectors/sqla/utils.py	`88.75% <0.00%> (-2.50%)`	⬇️
superset/result_set.py	`96.85% <0.00%> (-1.58%)`	⬇️
superset/models/core.py	`88.43% <0.00%> (-0.73%)`	⬇️
... and 37 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6244728...ed43b79.

rumbin · 2021-09-30T21:30:23Z

Any progress on this?
I wonder to what extent this functionality requires the correct database/schema of the dataset to be selected in order to work.
Especially when two datasets are to be joined this way, they need to refer to the same database.
Is this being checked?

Taking this functionality a bit further, I could imagine a dataset browser included in the schema browser that lists all datasets of the selected database, so they can be selected just like the existing tables/views of the DB.

stale · 2022-04-16T14:56:23Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

rumbin · 2022-04-16T16:11:43Z

This issue should not be closed by the stale bot, in my eyes. The suggested functionality is just too promising.

betodealmeida · 2022-04-16T16:40:24Z

I agree, @rumbin, let me do some more work on this.

shenrie · 2022-05-21T00:55:23Z

Want, want want....take my money!

villebro

Really cool feature, great way to leverage the SQLA model! Left a couple non-blocking comments, LGTM.

villebro · 2022-05-26T11:30:00Z

superset/jinja_context.py

+        "columns": columns,
+        "groupby": groupby,


Since groupby is deprecated in QueryObject and just gets mapped into columns, maybe we could just do this:

Suggested change

-        "columns": columns,
-        "groupby": groupby,
+        "columns": groupby or columns,

I believe it has the same effect
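The suggestion relies on Python's or returning its first truthy operand. A small sketch of that behavior (the build_query_columns helper is hypothetical and not part of Superset) shows that an empty groupby list falls back to columns just as None does:

```python
from typing import List, Optional


def build_query_columns(
    columns: List[str], groupby: Optional[List[str]] = None
) -> List[str]:
    # `or` returns the first truthy operand, so None and [] both fall back to columns.
    return groupby or columns
```

One consequence worth noting: with this expression an explicitly empty groupby cannot be distinguished from an omitted one.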

villebro · 2022-05-26T13:48:52Z

superset/jinja_context.py

+def dataset_macro(
+    dataset_id: int,
+    include_metrics: bool = False,
+    groupby: Optional[List[str]] = None,


Actually, I was thinking: should we replace groupby with columns here? It feels like it makes more sense, as you should be able to pick a subset of columns from the dataset even when you don't have metrics with a GROUP BY.

Will do, great idea!
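One way to read the agreed change, as a sketch only: a columns argument selects a subset whether or not metrics are involved, and omitting it selects everything. The select_clause helper and its available mapping (column name to SQL expression) are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Optional


def select_clause(available: Dict[str, str], columns: Optional[List[str]] = None) -> str:
    """Build a SELECT list from all columns, or from the requested subset."""
    # No subset requested: fall back to every column defined on the dataset.
    names = columns or list(available)
    return ", ".join(f"{available[name]} AS {name}" for name in names)
```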

pull-request-size bot added the size/M label Jun 17, 2021

betodealmeida commented Jun 17, 2021

stale bot added the inactive label Apr 16, 2022

stale bot removed the inactive label Apr 16, 2022

betodealmeida force-pushed the dataset_macro_sqllab branch 2 times, most recently from d567e51 to 3cbdda6 May 17, 2022 21:45

pull-request-size bot added size/L and removed size/M labels May 17, 2022

feat: Jinja2 macro for querying datasets

570e7c0

betodealmeida force-pushed the dataset_macro_sqllab branch from 3cbdda6 to 570e7c0 May 17, 2022 21:52

Add docs

2fff8e8

betodealmeida commented May 18, 2022

villebro approved these changes May 26, 2022

villebro reviewed May 26, 2022

Address comments

ed43b79

betodealmeida force-pushed the dataset_macro_sqllab branch from 98f2854 to ed43b79 June 1, 2022 20:39

betodealmeida merged commit 05a138a into apache:master Jun 1, 2022

philipher29 pushed a commit to ValtechMobility/superset that referenced this pull request Jun 9, 2022

feat: query datasets from SQL Lab (apache#15241)

97c14e3

* feat: Jinja2 macro for querying datasets
* Add docs
* Address comments

mistercrunch added the 🏷️ bot and 🚢 2.0.0 labels Feb 19, 2024
