Fix: add warning on duplicated yaml keys #5146
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.
An integration test from days of yonder failed and led me to a YAML test file (https://github.com/dbt-labs/dbt-core/blob/main/test/integration/006_simple_dependency_tests/local_dependency/models/schema.yml) with actual duplicated top-level keys. It looks like the "mapping node should be a set" 1, so I'm curious whether that particular test worked properly before 🤔 As a byproduct of this, it looks like previously, when the project had multiple duplicated keys: ...
models:
  - name: my_model_a
    columns:
      - name: id
        tests:
          - unique
models:
  - name: my_model_b
    columns:
      - name: id
        tests:
          - unique
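To make the behaviour under discussion concrete, here is a minimal sketch using PyYAML. It first shows that the stock safe loader accepts a repeated key and keeps only the last value, then shows one way to detect the repeat by overriding `construct_mapping`. The names `DuplicateKeyError` and `UniqueKeyLoader` are assumptions made for this illustration, not necessarily what the PR implements, and the sketch does not handle merge keys (`<<`) or unhashable keys.

```python
import yaml


class DuplicateKeyError(ValueError):
    """Hypothetical error type for this sketch (not dbt's exception class)."""


class UniqueKeyLoader(yaml.SafeLoader):
    """SafeLoader variant that rejects any mapping with a repeated key."""

    def construct_mapping(self, node, deep=False):
        seen = set()
        # node.value is a list of (key_node, value_node) pairs in file order.
        for key_node, _value_node in node.value:
            key = self.construct_object(key_node, deep=deep)
            if key in seen:
                raise DuplicateKeyError(f"Duplicate {key!r} key found in yaml file")
            seen.add(key)
        # No repeats found: fall back to the normal mapping construction.
        return super().construct_mapping(node, deep=deep)


DUPLICATED = """\
models:
  - name: my_model_a
models:
  - name: my_model_b
"""

# Stock behaviour: no error, and the later `models` block replaces the earlier one.
silently_loaded = yaml.safe_load(DUPLICATED)

# Stricter loader: the repeated key is reported instead of being dropped.
try:
    yaml.load(DUPLICATED, Loader=UniqueKeyLoader)
    duplicate_detected = False
except DuplicateKeyError:
    duplicate_detected = True
```

Note that this sketch checks every mapping in the file, not only the top level, so it is stricter than a check limited to top-level keys.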
I had a few comments about structure that I wanted to get back to you on. I'll take a look at the failing legacy tests and see what's happening there next.
core/dbt/clients/yaml_helper.py
Outdated
msg = f"{e} {path.searched_path}/{path.relative_path}."
dbt.exceptions.warn_or_error(msg, log_fmt=warning_tag("{}"))
You should use `warn_or_raise` here instead of `warn_or_error`. That way the exception you throw above bubbles up instead of getting overlaid by the `CompilationException` raised by `warn_or_error`. To get the message passed through as `DuplicateYamlKeyException`, you can modify the message:
e.msg = f"{e} {path.searched_path}/{path.relative_path}."
dbt.exceptions.warn_or_raise(e, log_fmt=warning_tag("{}"))
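The difference between the two helpers can be sketched with simplified stand-ins. Everything below is a hypothetical re-creation for illustration only: the real dbt functions read a global warn-error flag and log through dbt's own machinery, whereas this sketch takes a `warn_error` argument and prints.

```python
class CompilationException(Exception):
    """Stand-in for dbt's CompilationException (simplified)."""

    def __init__(self, msg):
        super().__init__(msg)
        self.msg = msg


class DuplicateYamlKeyException(CompilationException):
    """Stand-in subclass that adds no behaviour of its own."""


def warn_or_error(msg, warn_error):
    # Takes a plain message, so a new generic exception is built when
    # warnings are treated as errors; the original exception type is lost.
    if warn_error:
        raise CompilationException(msg)
    print(f"[WARNING]: {msg}")


def warn_or_raise(exc, warn_error):
    # Takes the exception object itself, so the original type survives.
    if warn_error:
        raise exc
    print(f"[WARNING]: {exc}")


original = DuplicateYamlKeyException("Duplicate 'models' key found in yaml file")

try:
    warn_or_error(str(original), warn_error=True)
except CompilationException as caught:
    type_from_warn_or_error = type(caught)

try:
    warn_or_raise(original, warn_error=True)
except CompilationException as caught:
    type_from_warn_or_raise = type(caught)
```

With warnings promoted to errors, the first call surfaces a plain `CompilationException`, while the second surfaces the original `DuplicateYamlKeyException`, which is the point of the suggestion above.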
With the new changes I see:
$ dbt run
10:59:42 Running with dbt=1.2.0-a1
10:59:42 [WARNING]: Compilation Error
Compilation Error
Duplicate 'models' key found in yaml file models/schema.yml.
10:59:43 Found 2 models, 0 tests, 0 snapshots, 0 analyses, 167 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
10:59:43
10:59:43 Concurrency: 1 threads (target='dev')
...
Wondering if that was what you expected here?
@@ -4,7 +4,6 @@ sources:
schema: invalid_schema
tables:
- name: my_table
sources:
@emmyoop surprisingly, fixing the duplicate top-level keys in this YAML file in the legacy tests causes it to pass, so now I'm really curious what this was doing previously. It makes me wonder if we had left two `sources` keys in this file by mistake the last time around, and the test was passing only because of the PyYAML default of reading just the last `sources` key in the file.
I'm not seeing anything in the tests that indicates we did it on purpose, so you may be right. These tests are going to be rewritten and we'll work through some of the weirdness in them as we go.
Looks good.
* add warning on duplicated yaml keys
* update structure and tests
* fix old test schema file
* add changelog
resolves #5114
Description
When we have a duplicate top level key in our schema yml files, we should see the following warning:
In `core/dbt/exceptions.py`, `DuplicateYamlKeyException` is a no-op subclass of `CompilationException`. Is this okay?
For the tests, because we warn instead of error, I'm using `run_dbt_and_capture()` from `dbt.tests.utils` and asserting that stdout has the right string. I'm not too sure this is the right way to do it. I also tried something along the lines of:
Which didn't quite work because we print a warning instead of raising an error. (Ftr, I just cribbed what was in `dbt/tests/functional/basic/tests_basic.py` for this one.)
Checklist
`changie new` to create a changelog entry
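The testing approach described above (capture stdout and assert on the warning text, because nothing is raised) can be sketched with the standard library alone. This is an illustration only: `run_and_capture` and `fake_parse` are hypothetical stand-ins and do not reproduce the signature or behaviour of dbt's `run_dbt_and_capture()`.

```python
import contextlib
import io


def run_and_capture(func, *args):
    """Call func(*args) and return (result, everything it printed to stdout)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args)
    return result, buffer.getvalue()


def fake_parse(path):
    # Hypothetical stand-in for a run that warns rather than raising.
    print(f"[WARNING]: Duplicate 'models' key found in yaml file {path}.")
    return "ok"


result, stdout = run_and_capture(fake_parse, "models/schema.yml")

# The run completes normally, so an "expect an exception" style test would
# never trigger; the warning is only observable in the captured output.
expected = "Duplicate 'models' key found in yaml file models/schema.yml."
warning_seen = expected in stdout
```

Asserting on a substring of the output keeps the test independent of timestamps and log prefixes, at the cost of coupling it to the exact wording of the warning.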