Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-1584] New top level commands: interactive compile #7008

Merged
merged 18 commits into from
Mar 11, 2023
Merged

Conversation

aranke
Copy link
Member

@aranke aranke commented Feb 19, 2023

resolves #6358

Description

Checklist

@cla-bot cla-bot bot added the cla:yes label Feb 19, 2023
@github-actions
Copy link
Contributor

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@aranke aranke marked this pull request as ready for review March 7, 2023 01:01
@aranke aranke requested a review from a team March 7, 2023 01:01
@aranke aranke requested review from a team as code owners March 7, 2023 01:01
@aranke aranke requested review from ChenyuLInx and emmyoop March 7, 2023 01:01
@aranke aranke requested review from stu-k and peterallenwebb and removed request for emmyoop March 7, 2023 03:25
Copy link
Contributor

@ChenyuLInx ChenyuLInx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks for traversing through the codebase and add this feature!

I did run into one issue that running dbt compile -s customers(I have a model named that), compiled result was not logged.

The rest of the comments are mostly informational/confirm things.

@@ -42,6 +43,7 @@
"fail_fast",
"indirect_selection",
"store_failures",
"introspect",
Copy link
Contributor

@ChenyuLInx ChenyuLInx Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we want to add introspect here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it to help a test pass, but it might not be required.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: removing this doesn't help, so I'll leave it as-is for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean removing it would cause things to fail? This doesn't feel right given that introspect is only a parameter for compile task command, and all parameter in this list is defiled both as cli click group level and a task command level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Punted into a new ticket: #7156

@@ -29,6 +29,7 @@
"FULL_REFRESH": False,
"STRICT_MODE": False,
"STORE_FAILURES": False,
"INTROSPECT": True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to figure out a way to remove this section at some point. Just write down what I think, this works before we do that.

@@ -1600,7 +1600,16 @@ message ConcurrencyLineMsg {
ConcurrencyLine data = 2;
}

// Skipped Q028
// Q028
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, what is Q stand for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Node execution, it's not the perfect category, but the best one I could find.

@@ -309,7 +310,7 @@ def skip_result(self, node, message):

def compile_and_execute(self, manifest, ctx):
result = None
with self.adapter.connection_for(self.node):
with self.adapter.connection_for(self.node) if get_flags().INTROSPECT else nullcontext():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user set DBT_INTROSPECT to False, but we are not running compile/preview command, get_flags().INTROSPECT will be True since it is using default value right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, correct.

core/dbt/task/compile.py Outdated Show resolved Hide resolved
Copy link
Contributor

@stu-k stu-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the exception of those relative imports, nothing stands out 👍

@aranke aranke requested a review from ChenyuLInx March 8, 2023 22:58
return "Q028"

def message(self) -> str:
return f"Compiled node '{self.node_name}' is:\n{self.compiled}"
Copy link
Member Author

@aranke aranke Mar 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbeatty10 Does this message make sense, or should I change it to something else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example what this message would look like with realistic values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup:

❯ dbt compile --inline "select * from {{ ref('raw_orders') }}"
...
17:50:59  Compiled node 'inline_query' is:
select * from "jaffle_shop"."main"."raw_orders"
17:50:59  Done.
❯ dbt compile --select stg_payments
...
17:51:27  Compiled node 'stg_payments' is:
with source as (
    select * from "jaffle_shop"."main"."raw_payments"

),

renamed as (

    select
        id as payment_id,
        order_id,
        payment_method,

        -- `amount` is currently stored in cents, so we convert it to dollars
        amount / 100 as amount

    from source

)

select * from renamed
17:51:27  Done.

Comment on lines +164 to +168
def file_exists(*paths):
"""Check if file exists at path"""
return os.path.exists(os.path.join(*paths))


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a problem to solve in this PR, but I wonder why os.path was used in this module rather than pathlib?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're already using os in several places instead of pathlib so wanted to keep the status quo to avoid bringing in a new import.
But I don't feel too strongly either way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, going with the flow make sense in this PR.

Something for us to consider in a refactor down the road though.

@@ -43,19 +43,32 @@ def raise_on_first_error(self):
return True

def get_node_selector(self) -> ResourceTypeSelector:
if getattr(self.args, "inline", None):
resource_types = [NodeType.SqlOperation]
elif getattr(self.args, "select", None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be a behavior change, we can do it in task_end_message if that change is not acceptable.(I don't know whether it is or not)

if getattr(self.args, "inline", None):
resource_types = [NodeType.SqlOperation]
elif getattr(self.args, "select", None):
resource_types = [NodeType.Model]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline yesterday with @aranke @ChenyuLInx:

This is to handle the fact that --select my_model will also select the tests on my_model, and we really only want to be compiling one node. The right way to handle this behavior is to turn off "indirect selection," and the cleanest approach there is probably to introduce a new option to turn off indirect selection entirely (--indirect-selection empty).

Depending on how much work is involved there, we can scope & estimate that as a separate issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a new selection method empty to only select a node that matches the selector.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did y'all consider any alternative names to empty? For example, none feels more intuitive to me, but maybe y'all considered it already.

Adding empty to the current options would give:

  1. eager (default) - include ANY test that references the selected nodes
  2. cautious - restrict to tests that ONLY refer to selected nodes
  3. buildable - restrict to tests that ONLY refer to selected nodes (or their ancestors)
  4. empty - don't include any tests

If it is named something like none, then the list would be this instead:

  1. eager
  2. cautious
  3. buildable
  4. none

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't name a Click option none without some trickery, since that's a built-in Python type 😞
I anticipate Cloud to be the only people who use this, so didn't give it too much thought.

@aranke aranke requested a review from ChenyuLInx March 10, 2023 23:04
Copy link
Contributor

@ChenyuLInx ChenyuLInx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good over all! Last two questions

def test_select_pass(self, project):
(results, log_output) = run_dbt_and_capture(["compile", "--select", "second_model"])
assert len(results) == 3
assert "Compiled node 'second_model' is:" in log_output
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought in this case we would't print the compiled result for second model?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked with @jtcohen6 in the office, we should print this out even if we did compile it.
I've included the filtering behavior you mentioned to only print out exact matches for nodes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then --indirect-selection empty is only to make sure we don't compile test etc? makes sense to me!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's to only compile the node that is selected and nothing else.

@@ -42,6 +43,7 @@
"fail_fast",
"indirect_selection",
"store_failures",
"introspect",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean removing it would cause things to fail? This doesn't feel right given that introspect is only a parameter for compile task command, and all parameter in this list is defiled both as cli click group level and a task command level.

Copy link
Contributor

@ChenyuLInx ChenyuLInx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one question

core/dbt/task/compile.py Outdated Show resolved Hide resolved
@aranke aranke merged commit 3050369 into main Mar 11, 2023
@aranke aranke deleted the interactive_compile branch March 11, 2023 01:58
@dbeatty10
Copy link
Contributor

@aranke Do we have an issue opened in the docs.getdbt.com repo for this new feature?

If not, then we'd want to fill out the template here.

We'd want to add the dbt-core v1.5 label so that it gets tracked with the other 1.5 features that need documentation.

peterallenwebb pushed a commit that referenced this pull request Mar 14, 2023
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
@aranke
Copy link
Member Author

aranke commented Mar 14, 2023

@dbeatty10 done, thank you!
dbt-labs/docs.getdbt.com#2994

@dbeatty10
Copy link
Contributor

@dbeatty10 done, thank you! dbt-labs/docs.getdbt.com#2994

Sure thing @aranke !

Could you also add the dbt-core v1.5 label to dbt-labs/docs.getdbt.com#2994 ?

I just updated the PR checklist so it's easy to see the linked issue and its status:

image

peterallenwebb added a commit that referenced this pull request Mar 22, 2023
* ct-2198: clean up some type names and uses

* CT-2198: Unify constraints and constraints_check properties on columns

* Make mypy version consistently 0.981 (#7134)

* CT 1808 diff based partial parsing (#6873)

* model contracts on models materialized as views (#7120)

* first pass

* rename tests

* fix failing test

* changelog

* fix functional test

* Update core/dbt/parser/base.py

* Update core/dbt/parser/schemas.py

* Create method for env var deprecation (#7086)

* update to allow adapters to change model name resolution in py models (#7115)

* update to allow adapters to change model name resolution in py models

* add changie

* fix newline adds

* move quoting into macro

* use single quotes

* add env DBT_PROJECT_DIR support #6078 (#6659)

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>

* Add new index.html and changelog yaml files from dbt-docs (#7141)

* Make version configs optional (#7060)

* [CT-1584] New top level commands: interactive compile (#7008)

Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>

* CT-2198: Add changelog entry

* CT-2198: Fix tests which broke after merge

* CT-2198: Add explicit validation of constraint types w/ unit test

* CT-2198: Move access property, per code review

* CT-2198: Remove a redundant macro

* CT-1298: Rework constraints to be adapter-generated in Python code

* CT-2198: Clarify function name per review

---------

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
Co-authored-by: Stu Kilgore <stu.kilgore@dbtlabs.com>
Co-authored-by: colin-rogers-dbt <111200756+colin-rogers-dbt@users.noreply.github.com>
Co-authored-by: Leo Schick <67712864+leo-schick@users.noreply.github.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: FishtownBuildBot <77737458+FishtownBuildBot@users.noreply.github.com>
Co-authored-by: dave-connors-3 <73915542+dave-connors-3@users.noreply.github.com>
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
Co-authored-by: Github Build Bot <buildbot@fishtownanalytics.com>
QMalcolm added a commit that referenced this pull request Apr 8, 2024
Over a year ago in [#7008](https://github.com/dbt-labs/dbt-core/pull/7008/files#diff-c10753a61a1d8f973ec3206206d72b3c5200e41673ab2c6b256f333103eadb8f)
the tests in `test_compiler.py` got moved into other testing files. Since
then this file has been sitting around empty. It is probably a reasonable
time to say goodbye to it.
QMalcolm added a commit that referenced this pull request Apr 10, 2024
#9878)

* Delete empty `test_compiler.py` file

Over a year ago in [#7008](https://github.com/dbt-labs/dbt-core/pull/7008/files#diff-c10753a61a1d8f973ec3206206d72b3c5200e41673ab2c6b256f333103eadb8f)
the tests in `test_compiler.py` got moved into other testing files. Since
then this file has been sitting around empty. It is probably a reasonable
time to say goodbye to it.

* Delete empty `searcher.py` file

The file `searcher.py` was added, as an empty file, four years ago in
[#2312](https://github.com/dbt-labs/dbt-core/pull/2312/files#diff-c8e9e62cf63356f44fe1a2b0272b239d7a650c57911477ed4549b15ee3de16fa).
Since then it has never been touched or added to. I'm not sure of the
initial purpose of the file as it never contained anything and I have not
found any documentation in reference to it. It's safe to say, it can go
away.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-1584] [Feature] New top level commands: interactive compile
6 participants