Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema yaml rendering and source overrides #2357

Merged
merged 15 commits into from
Apr 29, 2020

Conversation

beckjake
Copy link
Contributor

@beckjake beckjake commented Apr 24, 2020

resolves #2269
resolves #2283
resolves #2287

Description

This PR accomplishes a few related tasks:

  • All of schema.yml is now rendered at parse time (except tests and descriptions)
    • split the config rendering logic up for the different files it handles
  • The docs context now has access to everything schema.yml does
  • Dependency sources can be overridden in schema.yml files by adding the key overrides
    • To accommodate this, some structural changes:
      • re-ordered source parsing. both sources and patches are stored during parsing, and source tests are not rendered until after parsing is complete.
      • This means source tests (and less importantly, sources themselves) can't participate in partial parsing
      • Split the manifest into nodes and sources
      • Sources have more meaningful FQNs now
    • Snapshot freshness overrides do the same merge that source tables do to sources, everything else clobbers
  • Schema.yml, profiles.yml, and dbt_project.yml values that are rendered are all now rendered in the native env.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Apr 24, 2020
@beckjake beckjake force-pushed the feature/schema-yml-rendering-3 branch from 00515f9 to bd48904 Compare April 24, 2020 17:25

def build_flat_graph(self):
"""This attribute is used in context.common by each node, so we want to
only build it once and avoid any concurrency issues around it.
Make sure you don't call this until you're done with building your
manifest!
"""
# TODO: we could supply sources in nodes here, for backwards
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need to do this - interested in feedback on it!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to break the existing contract here - using graph.nodes is pretty niche and we can call it out in the release notes / migration guide. I feel great about requiring users to go through graph.sources instead of graph.nodes, as many of the nodes in graph.nodes are going to be poorly constructed at parse-time.

Check out the big disclaimer here for more info: https://docs.getdbt.com/docs/writing-code-in-dbt/jinja-context/graph/

@beckjake beckjake force-pushed the feature/schema-yml-rendering-3 branch 2 times, most recently from afbbf59 to 6199871 Compare April 24, 2020 17:39
@beckjake beckjake requested a review from drewbanin April 24, 2020 17:45
@beckjake beckjake force-pushed the feature/schema-yml-rendering-3 branch from aa93347 to 52cf845 Compare April 24, 2020 20:21
Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really great! Couple of pieces of feedback from playing around with this locally:


Can we add a warning message if a source patch overrides a source which does not exist? This was actually user I encountered while testing, and it would have been really helpful to see a warning.


I see a stack trace when trying to run dbt source snapshot-freshness:

User Error


I'll keep testing this, but wanted to send this initial feedback through sooner rather than later. Overall this is very, very good!!

CHANGELOG.md Show resolved Hide resolved

def build_flat_graph(self):
"""This attribute is used in context.common by each node, so we want to
only build it once and avoid any concurrency issues around it.
Make sure you don't call this until you're done with building your
manifest!
"""
# TODO: we could supply sources in nodes here, for backwards
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to break the existing contract here - using graph.nodes is pretty niche and we can call it out in the release notes / migration guide. I feel great about requiring users to go through graph.sources instead of graph.nodes, as many of the nodes in graph.nodes are going to be poorly constructed at parse-time.

Check out the big disclaimer here for more info: https://docs.getdbt.com/docs/writing-code-in-dbt/jinja-context/graph/

self.flat_graph = {
'nodes': {
k: v.to_dict(omit_none=False) for k, v in self.nodes.items()
},
'sources': {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

core/dbt/parser/schemas.py Show resolved Hide resolved
core/dbt/parser/sources.py Show resolved Hide resolved
core/dbt/task/runnable.py Show resolved Hide resolved
core/dbt/exceptions.py Show resolved Hide resolved
@beckjake
Copy link
Contributor Author

Can we add a warning message if a source patch overrides a source which does not exist? This was actually user I encountered while testing, and it would have been really helpful to see a warning.

Yes, definitely, great idea

@drewbanin
Copy link
Contributor

I think we've lost sources in the catalog. Can you update the get_catalog impls to also pull source schemas out of the manifest now that they're not reflected in graph.nodes anymore?

Jacob Beck added 11 commits April 28, 2020 11:28
The SchemaParser now renders the entire schema.yml except tests/descriptions
Snapshot config resolution
add override tests
fix some unit tests
Add/modify tests to test native rendering/config application
Handle quoting in render() instead of get_rendered code
 - avoid stripping the quotes from a string that was returned as jinja
@beckjake beckjake force-pushed the feature/schema-yml-rendering-3 branch from 54c9bc4 to ed2fb08 Compare April 28, 2020 17:29
Jacob Beck added 2 commits April 28, 2020 11:37
- add paths to source patches, apply them to parsed source definitions
- warn on unused source patches
- fix error on duplicate source patches
- add sources back into the catalog
- update changelog
@beckjake beckjake force-pushed the feature/schema-yml-rendering-3 branch from ed2fb08 to d383441 Compare April 28, 2020 17:37
@beckjake
Copy link
Contributor Author

@drewbanin I believe I've addressed all the PR feedback.

@beckjake beckjake force-pushed the feature/schema-yml-rendering-3 branch from c228f5e to 0b82def Compare April 28, 2020 18:49
Copy link
Contributor

@drewbanin drewbanin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - ship it!

@beckjake beckjake merged commit 5884f7d into dev/octavius-catto Apr 29, 2020
@beckjake beckjake deleted the feature/schema-yml-rendering-3 branch April 29, 2020 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants