Make dbt-sql and default-sql templates public #1463

lennartkats-db · 2024-06-02T11:11:40Z

Changes

This makes the dbt-sql and default-sql templates public.

These templates were previously not listed and marked "experimental" since structured streaming tables were still in gated preview and would result in weird error messages when a workspace wasn't enabled for the preview.

This PR also incorporates some of the feedback and learnings for these templates so far.

libs/template/materialize.go

codecov-commenter · 2024-06-03T07:11:22Z

Codecov Report

Attention: Patch coverage is 0% with 7 lines in your changes missing coverage. Please review.

Project coverage is 53.57%. Comparing base (e22dd8a) to head (2deaa71).
Report is 128 commits behind head on main.

Files	Patch %	Lines
libs/template/materialize.go	0.00%	7 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1463      +/-   ##
==========================================
+ Coverage   52.25%   53.57%   +1.31%     
==========================================
  Files         317      351      +34     
  Lines       18004    20271    +2267     
==========================================
+ Hits         9408    10860    +1452     
- Misses       7903     8612     +709     
- Partials      693      799     +106

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

pietern · 2024-06-03T09:32:03Z

...mplate/templates/dbt-sql/template/{{.project_name}}/resources/{{.project_name}}_job.yml.tmpl

-            - 'dbt run --target=${bundle.target} --vars "{ dev_schema: {{$dev_schema}} }"'
+{{- if (regexp "^yes").MatchString .personal_schemas}}
+            - 'dbt seed --target=${bundle.target} --vars "{ dev_schema: ${workspace.current_user.short_name} }"'
+            - 'dbt run --target=${bundle.target} --vars "{ dev_schema: ${workspace.current_user.short_name} }"'


This will also apply to the prod target, which contradicts the explanation in the template:

yes, use a schema based on the current user name during development

You could use target overrides to only apply this for the dev target.

This should actually work as intended since the "dev_schema" var is only used in the dev profile: https://github.com/databricks/cli/pull/1463/files#diff-0ca9390ffa1ff34b28194f7947d477dd79b02f9fe6aa14a578483469913e4d9aR21.

This is the place where our dev/prod targets and the dbt profiles come together. Which can look a bit weird. To make the logic of the generated yml seem bit simpler, I'm no longer passing "dev_schema" unless yes, use a schema based on the current user name during development is actually enabled. Once that is enabled, customers will have to stare at the slightly more complicated ymkl.

It is possible to do this exclusively for the dev target using overrides. Then it is more clear the dev schema only pertains to the dev target. It's pretty confusing as is, IMO.

targets: dev: resources: jobs: dbt_project_job: tasks: - task_key: dbt dbt_task: commands: - 'dbt deps --target=${bundle.target}' - 'dbt seed --target=${bundle.target} --vars "{ dev_schema: ${workspace.current_user.short_name} }"' - 'dbt run --target=${bundle.target} --vars "{ dev_schema: ${workspace.current_user.short_name} }"' prod: resources: jobs: dbt_project_job: tasks: - task_key: dbt dbt_task: commands: - 'dbt deps --target=${bundle.target}' - 'dbt seed --target=${bundle.target}' - 'dbt run --target=${bundle.target}'

The top level resource then shouldn't specify commands, as lists are appended in overrides.

Or define another variable that interpolates the inner string for --vars.

The challenge with overrides is that it duplicates things and breaks visual editing. And in this case, we're just passing the value of a parameter, not changing behavior. So I'd rather just keep a single version of this job.

How about a comment then? The generated configuration should pre-empt questions people have. Maybe "the dev schema is not used in the prod target" or something?

Yes that sounds good, added!

libs/template/materialize.go

pietern · 2024-06-03T14:25:52Z

Right now the workspace host is displayed as if it were a variable. The intent is for this to be human-readable right?

lennartkats-db · 2024-06-03T15:04:04Z

@pietern

Right now the workspace host is displayed as if it were a variable. The intent is for this to be human-readable right?

I'm not sure what you mean? It doesn't look like a variable on your screenshot? Do you mean that it looks a bit like an input prompt? That is intentional to make all settings appear similar.

pietern · 2024-06-03T15:19:12Z

I would expect it to read like prose, like the preceding text.

The template targets workspace https://...

As opposed to a key: value.

lennartkats-db · 2024-06-04T07:43:11Z

@pietern yeah, it totally could. My mental model has been that this is just another setting but one that is set indirectly. That seemed like the easiest way to explain it. Any suggestions for phrasing if we present that in a different way?

I tried

Welcome to the dbt template for Databricks Asset Bundles!

Workspace selected based on your current profile: https://e2-dogfood.staging.cloud.databricks.com
(See https://docs.databricks.com/dev-tools/cli/profiles.html for how to change this).

Please provide a unique name for this project.
project_name [dbt_project]:

But the text just seems a bit out of place

pietern · 2024-06-04T08:28:03Z

...mplate/templates/dbt-sql/template/{{.project_name}}/resources/{{.project_name}}_job.yml.tmpl

+            # The dbt commands to run (see also the dev/prod profiles in dbt_profiles/profiles.yml)
+            - 'dbt deps --target=${bundle.target}'
+            - 'dbt seed'
+            - 'dbt run'


Is the target not required after running deps?

Ah yes, it is. Good catch. As you can see it is used in the other branch of the {{if}}.

CLI: * Update OpenAPI spec ([#1466](#1466)). Bundles: * Upgrade TF provider to 1.46.0 ([#1460](#1460)). * Add support for Lakehouse monitoring ([#1307](#1307)). * Make dbt-sql and default-sql templates public ([#1463](#1463)). Internal: * Abstract over filesystem interaction with libs/vfs ([#1452](#1452)). * Add `filer.Filer` to read notebooks from WSFS without omitting their extension ([#1457](#1457)). * Fix listing notebooks in a subdirectory ([#1468](#1468)). API Changes: * Changed `databricks account storage-credentials list` command to return . * Added `databricks consumer-listings batch-get` command. * Added `databricks consumer-providers batch-get` command. * Removed `databricks apps create-deployment` command. * Added `databricks apps deploy` command. OpenAPI commit 37b925eba37dfb3d7e05b6ba2d458454ce62d3a0 (2024-06-03) Dependency updates: * Bump github.com/hashicorp/go-version from 1.6.0 to 1.7.0 ([#1454](#1454)). * Bump github.com/hashicorp/hc-install from 0.6.4 to 0.7.0 ([#1453](#1453)).

## Changes This fixes a last-minute regression that snuck into #1463: unfortunately we need to use `USE IDENTIFIER('schema')` to select a schema for now. In the future we expect we can just use `USE SCHEMA 'schema'`.

lennartkats-db added 3 commits May 27, 2024 15:36

Make default-sql and dbt-sql templates public

0d8775e

Refine template based on multi-statement support

55f55a5

Simplify schema selection

3d1c2f5

lennartkats-db requested review from pietern and shreyas-goenka June 2, 2024 11:11

Cleanup

e684e88

lennartkats-db commented Jun 2, 2024

View reviewed changes

libs/template/materialize.go Show resolved Hide resolved

Fix tests

7afc0cd

pietern reviewed Jun 3, 2024

View reviewed changes

Move workspace_host to welcome message

33d6edf

Adding a comment

df7968f

pietern approved these changes Jun 4, 2024

View reviewed changes

lennartkats-db added 2 commits June 4, 2024 10:50

Fix run command

ccd9bf8

Merge remote-tracking branch 'databricks/main' into enable-sql-templates

2deaa71

lennartkats-db enabled auto-merge June 4, 2024 08:52

lennartkats-db added this pull request to the merge queue Jun 4, 2024

Merged via the queue into databricks:main with commit aa36aee Jun 4, 2024
5 checks passed

lennartkats-db deleted the enable-sql-templates branch June 4, 2024 09:05

pietern mentioned this pull request Jun 4, 2024

Release v0.221.0 #1470

Merged

lennartkats-db mentioned this pull request Jun 4, 2024

Fix SQL schema selection in default-sql template #1471

Merged

lennartkats-db mentioned this pull request Jun 4, 2024

[DONTMERGE] Refine experimental SQL template based on multi-statement support #1289

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make dbt-sql and default-sql templates public #1463

Make dbt-sql and default-sql templates public #1463

lennartkats-db commented Jun 2, 2024

codecov-commenter commented Jun 3, 2024 •

edited

Loading

pietern Jun 3, 2024

lennartkats-db Jun 3, 2024

pietern Jun 3, 2024

pietern Jun 3, 2024

lennartkats-db Jun 3, 2024

pietern Jun 3, 2024

lennartkats-db Jun 4, 2024

pietern commented Jun 3, 2024

lennartkats-db commented Jun 3, 2024

pietern commented Jun 3, 2024

lennartkats-db commented Jun 4, 2024

pietern Jun 4, 2024

lennartkats-db Jun 4, 2024

Make dbt-sql and default-sql templates public #1463

Make dbt-sql and default-sql templates public #1463

Conversation

lennartkats-db commented Jun 2, 2024

Changes

codecov-commenter commented Jun 3, 2024 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pietern commented Jun 3, 2024

lennartkats-db commented Jun 3, 2024

pietern commented Jun 3, 2024

lennartkats-db commented Jun 4, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-commenter commented Jun 3, 2024 •

edited

Loading