[CT-2429] `dbt show`: option to use already-materialized model instead of rerunning SQL #7391

jtcohen6 · 2023-04-18T11:56:29Z

As a dbt developer, especially on BigQuery, I want to preview my just-built table in the fastest & cheapest way possible.

When users run dbt show --select my_model, rather than recompiling and executing that model's SQL, give users the option to preview the already-materialized version of my_model in the data warehouse.

This goes beyond just --inline "select * from {{ ref('that_model') }}", because it would also unlock adapter-specific optimizations.

On all data platforms, this saves compute
On BigQuery, this saves $$$.

Options:

Config, --from-materialized (I hate this name)
Different command: dbt head, to match bq head?

In the short term, it would be the responsibility of users to track logical/applied state, and do their own "cache invalidation" if they've edited the model's definition since they last built it. (In the longer term, we could get cleverer about this in stateful runtime environments.)

This would be functionally identical for views and ephemeral models.

The text was updated successfully, but these errors were encountered:

b-per · 2023-04-18T12:02:17Z

Throwing other options for the config name:

--existing
--last-refreshed
--materialized
--last-materialized

jtcohen6 · 2023-04-20T13:27:50Z

In theory, this would also work for Python models — build first, then preview.

Another option is to have this command/config implicitly materialize the model first: run it, then show it. I'd rather the building be explicit, though — and then provide the option of showing the just-built thing.

github-actions · 2024-01-17T01:47:45Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

b-per · 2024-01-17T10:05:20Z

I'd love for this to happen! This could be very useful to save costs when querying data and avoid having people to go to their Data Warehouse interface to do so.

dbeatty10 · 2024-06-10T17:38:02Z

New config vs. different command

Of the two options, I prefer a different command (like described in #10265).

e.g. I greatly prefer this:

dbt head --select my_model

Rather than:

dbt show --select my_model --head

Advantages

Creating a new command has the advantage of being twice as easy to type out. It would also handle situations that dbt show doesn't currently:

Python models
sources
snapshots
incremental models
etc.

Trade-offs

Unless we did some fancy footwork, it wouldn't work in any cases where dbt doesn't materialize a database object though:

ephemeral models
analyses
etc.

e.g., if we just do a direct select * from {{ ref('that_model') }}, then new command (or CLI flag) would work IFF the node type "exists in database". But we can imagine adding some logic to handle the case where there won't be an object in the database (because it is ephemeral, etc.) by just doing the same thing as the current dbt show in those specific cases.

pransito · 2024-08-17T08:47:56Z

I think this is a very good idea to allow dbt show to do more

dbt show with these flags would be great!

also these flags would be helpful:

--to-file "my_output.parquet" --file-output-format "parquet"

maxmue · 2024-08-29T14:15:01Z

It's off-topic, but how do I make dbt show (latest cloud cli) show all columns instead of omitting many while just showing ...?

dbeatty10 · 2024-08-29T15:08:34Z

@maxmue to see all the columns, I use the --output json option which will output a JSON format of one line per column (rather than one line per row).

$ dbt show -s table_x --output json
14:58:28  Running with dbt=1.8.5
14:58:31  Registered adapter: duckdb=1.8.3
14:58:31  Found 4 models, 410 macros, 1 unit test
14:58:31  
14:58:32  Concurrency: 1 threads (target='dev')
14:58:32  
14:58:32  {
  "node": "table_x",
  "show": [
    {
      "starship": "USS Voyager",
      "quote": "There's coffee in that nebula"
    }
  ]
}

If you're hoping for some way to keep the same tabular format but without the ..., please open a feature request here.

dbeatty10 · 2024-08-29T15:13:58Z

@pransito

also these flags would be helpful:

--to-file "my_output.parquet" --file-output-format "parquet"

In general, this would need to be a separate feature request.

But I can say ahead of time that we're not planning on exporting the results of dbt show into different file formats like Parquet, CSV, TSV, etc.

jtcohen6 added enhancement New feature or request Team:Execution Team:Adapters Issues designated for the adapter area of the code labels Apr 18, 2023

github-actions bot changed the title ~~dbt show: option to use already-materialized model instead of rerunning SQL~~ [CT-2429] dbt show: option to use already-materialized model instead of rerunning SQL Apr 18, 2023

jtcohen6 mentioned this issue Apr 18, 2023

[CT-2428] dbt show should include --limit in compiled query #7390

Closed

jtcohen6 mentioned this issue Apr 20, 2023

[CT-2373][Bug] dbt show doesn't work for seeds #7273

Closed

jtcohen6 added the Refinement Maintainer input needed label Apr 24, 2023

jtcohen6 removed the Team:Execution label Jul 19, 2023

jtcohen6 mentioned this issue Aug 22, 2023

[CT-3017] [Feature] Support show of python model #8467

Open

3 tasks

github-actions bot added the stale Issues that have gone stale label Jan 17, 2024

github-actions bot removed the stale Issues that have gone stale label Jan 18, 2024

mikealfare removed the Team:Adapters Issues designated for the adapter area of the code label Feb 13, 2024

dbeatty10 mentioned this issue Jun 10, 2024

[Feature] Add a head command to see the first rows of a table #10265

Closed

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CT-2429] `dbt show`: option to use already-materialized model instead of rerunning SQL #7391

[CT-2429] `dbt show`: option to use already-materialized model instead of rerunning SQL #7391

jtcohen6 commented Apr 18, 2023 •

edited by dbeatty10

Loading

b-per commented Apr 18, 2023

jtcohen6 commented Apr 20, 2023

github-actions bot commented Jan 17, 2024

b-per commented Jan 17, 2024

dbeatty10 commented Jun 10, 2024

pransito commented Aug 17, 2024

maxmue commented Aug 29, 2024

dbeatty10 commented Aug 29, 2024

dbeatty10 commented Aug 29, 2024

[CT-2429] dbt show: option to use already-materialized model instead of rerunning SQL #7391

[CT-2429] dbt show: option to use already-materialized model instead of rerunning SQL #7391

Comments

jtcohen6 commented Apr 18, 2023 • edited by dbeatty10 Loading

b-per commented Apr 18, 2023

jtcohen6 commented Apr 20, 2023

github-actions bot commented Jan 17, 2024

b-per commented Jan 17, 2024

dbeatty10 commented Jun 10, 2024

New config vs. different command

Advantages

Trade-offs

pransito commented Aug 17, 2024

maxmue commented Aug 29, 2024

dbeatty10 commented Aug 29, 2024

dbeatty10 commented Aug 29, 2024

[CT-2429] `dbt show`: option to use already-materialized model instead of rerunning SQL #7391

[CT-2429] `dbt show`: option to use already-materialized model instead of rerunning SQL #7391

jtcohen6 commented Apr 18, 2023 •

edited by dbeatty10

Loading