Generate_model_yaml doesn't recognize nested BigQuery columns #27
@jtcohen6 — subbing you in as my favorite BigQuery expert! Do you agree that the expected behavior should be to create one column per key? Or do you think this would be nice as an optional flag?
The current behavior of dbt Docs is to include both the top-level field and each of the nested subfields. I agree with @RBozydar that I'd expect codegen to create one YML column entry for each of those nested fields. FWIW I think this is an enhancement, not a bug :) cf dbt-labs/dbt-core#2549
I experienced this issue myself today, wanting to generate the model YAML for a model with nested BigQuery columns. @jtcohen6, since the macro here simply calls `adapter.get_columns_in_relation`, the nested fields don't get their own entries in the output.
@bodschut Fair point! Care to open a new issue in the relevant repo?
Actually @jtcohen6 I took a deeper look, and you can access nested fields in the result of `adapter.get_columns_in_relation`: each returned column exposes its subfields via its `fields` attribute.
Hi all, I have a fix for this issue. Basically, the `generate_model_yaml.sql` macro file should look like this:

```sql
{% macro generate_column_yaml(column, model_yaml, parent_column_name="") %}
    {% if parent_column_name %}
        {% set column_name = parent_column_name ~ "." ~ column.name %}
    {% else %}
        {% set column_name = column.name %}
    {% endif %}

    {% do model_yaml.append('      - name: ' ~ column_name | lower) %}
    {% do model_yaml.append('        description: ""') %}
    {% do model_yaml.append('') %}

    {# BigQuery STRUCT/ARRAY columns expose their subfields via `fields`; recurse into them #}
    {% if column.fields | length > 0 %}
        {% for child_column in column.fields %}
            {% set model_yaml = generate_column_yaml(child_column, model_yaml, parent_column_name=column_name) %}
        {% endfor %}
    {% endif %}

    {% do return(model_yaml) %}
{% endmacro %}

{% macro generate_model_yaml(model_name) %}
    {% set model_yaml = [] %}

    {% do model_yaml.append('version: 2') %}
    {% do model_yaml.append('') %}
    {% do model_yaml.append('models:') %}
    {% do model_yaml.append('  - name: ' ~ model_name | lower) %}
    {% do model_yaml.append('    description: ""') %}
    {% do model_yaml.append('    columns:') %}

    {% set relation = ref(model_name) %}
    {%- set columns = adapter.get_columns_in_relation(relation) -%}

    {% for column in columns %}
        {% set model_yaml = generate_column_yaml(column, model_yaml) %}
    {% endfor %}

    {% if execute %}
        {% set joined = model_yaml | join('\n') %}
        {{ log(joined, info=True) }}
        {% do return(joined) %}
    {% endif %}
{% endmacro %}
```

Essentially, I've split out the logic that generates the YAML for a column and used it recursively. I've tested it on a model with multiple nesting levels and it works. I used the output to add descriptions on nested columns, and they are applied perfectly in BigQuery.
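For anyone who wants to try this out, the macro can be invoked via dbt's `run-operation` command; the model name `my_model` here is a placeholder:

```shell
dbt run-operation generate_model_yaml --args '{"model_name": "my_model"}'
```

The generated YAML is printed to the console by the `log(joined, info=True)` call, ready to paste into a schema file.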
@jtcohen6 any chance we can incorporate this? :-)
@bodschut Nice work! It sounds like we can accomplish the fix with a change to code in this repo only. Would you be up to open a PR? :)
@jtcohen6 the pull request is created. Can you help me figure out whether extra integration tests are needed?
Describe the bug
`generate_model_yaml` doesn't recognize nested BigQuery columns.
Steps to reproduce
Create a model with nested columns, either through STRUCT() or ARRAY(), for example:
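For instance, a minimal model along these lines (the field values and file name are hypothetical; the column names match the expected results below):

```sql
-- models/my_model.sql (hypothetical name)
select
    struct(
        'google' as source,
        'cpc' as medium,
        'google / cpc' as source_medium
    ) as analytics
```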
Expected results
The macro should print the nested columns as well; currently it prints only the "main" column/field. With the example above, we should get source, medium, source_medium, and analytics as columns in the YAML output.
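For reference, the recursive macro posted earlier in this thread emits one entry per nested field, using dotted, lower-cased paths via its `parent_column_name` argument (the model name `my_model` is a placeholder):

```yaml
version: 2

models:
  - name: my_model
    description: ""
    columns:
      - name: analytics
        description: ""

      - name: analytics.source
        description: ""

      - name: analytics.medium
        description: ""

      - name: analytics.source_medium
        description: ""
```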
Actual results
The macro prints only the "main" column/field, which is analytics.
System information
The contents of your packages.yml file:
Which database are you using dbt with? BigQuery
The output of dbt --version:
The operating system you're using: Ubuntu 18.04
The output of python --version: Python 3.8.1
Are you interested in contributing the fix?
I would love to; unfortunately my knowledge of Jinja isn't sufficient, so I'd appreciate some pointers.