Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-2676] [Bug] persist_docs with columns starting with a number errors #7832

Closed
2 tasks done
tnorlund-dsc opened this issue Jun 9, 2023 · 6 comments
Closed
2 tasks done
Labels
bug Something isn't working

Comments

@tnorlund-dsc
Copy link

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When I create a table with the persist_docs in the config, I get an error when running the model.

{{
    config(
        schema="stg",
        materialized="incremental",
        unique_key="id",
        persist_docs={"relation": true, "columns": true},
    )
}}

select
...
    sponsored_brands."14_day_total_orders",
...
from
    {{ source("source", "table") }} sponsored_brands
{% if is_incremental() %}
...
{% endif %}

With the logs showing:

00:18:26.539504 [debug] [Thread-1  ]: On model.DSC_DATAMART.cln_ra__sponsored_brands: /* {"app": "dbt", "dbt_version": "1.4.1", "profile_name": "<profile>", "target_name": "dev", "node_id": "model.<database>.cln_ra__sponsored_brands"} */
...
    comment on column "dm"."<schema>"."stg__cln_ra__sponsored_brands".14_day_total_orders is $dbt_comment_literal_block$$dbt_comment_literal_block$;
...
00:18:27.083668 [debug] [Thread-1  ]: Postgres adapter: Postgres error: syntax error at or near ".14"
LINE 56: ...m"."..."."stg__cln_ra__sponsored_brands".14_day_to...

Expected Behavior

I should be able to persist_docs in Redshift and have columns that are named using ASCI characters.

Steps To Reproduce

While using Redshift, use a config that has persists_docs and add column descriptions for columns that have names that start with a number.

Relevant log output

No response

Environment

- OS:
- Python:
- dbt:

Which database adapter are you using with dbt?

redshift

Additional Context

No response

@tnorlund-dsc tnorlund-dsc added bug Something isn't working triage labels Jun 9, 2023
@github-actions github-actions bot changed the title [Bug] persist_docs with columns starting with a number errors [CT-2676] [Bug] persist_docs with columns starting with a number errors Jun 9, 2023
@tnorlund-dsc
Copy link
Author

When I dbt run --select model_that_has_persist_docs after removing the persist_docs from the config, it runs fine. Please let me know if there are any other logs/files I can share.

@dbeatty10
Copy link
Contributor

Thanks for reporting this @tnorlund-dsc !

I'm having trouble reproducing this on my end. Could you try out the following and let me know if it does or doesn't work for you? If it does work without issues, can you try tweaking it to get it to reproduce the error you are seeing?

models/my_model.sql

{{
    config(
        materialized="table",
    )
}}

select 1 as "14_day_total_orders"

models/model_that_has_persist_docs.sql

{{
    config(
        materialized="table",
        persist_docs={"relation": true, "columns": true},
    )
}}

select * from {{ ref("my_model") }}
dbt run --select my_model
dbt run --select model_that_has_persist_docs

@tnorlund-dsc
Copy link
Author

I'm able to repro the problem when the model with persist_docs has a yaml file with column comments on it.

In my example, sponsored_brands.sql is a model that has a column named 14_day_total_orders where persist_docs is in the config. The yaml file also has this model and describes the column with a description.

I don't see a yaml file in your example.

The logs show the run is failing after the table is created. I'm guessing the run fails when the comments are set:

00:18:26.515388 [debug] [Thread-1  ]: SQL status: SELECT in 0 seconds
00:18:26.537642 [debug] [Thread-1  ]: Using redshift connection "model.<...>.cln_ra__sponsored_brands"
00:18:26.539504 [debug] [Thread-1  ]: On model.<...>.cln_ra__sponsored_brands: /* {"app": "dbt", "dbt_version": "1.4.1", "profile_name": "<profile>", "target_name": "dev", "node_id": "model.<...>.cln_ra__sponsored_brands"} */
...
    comment on column "dm"."<my development env>"."stg__cln_ra__sponsored_brands".14_day_total_orders is $dbt_comment_literal_block$$dbt_comment_literal_block$;
...
00:18:27.083668 [debug] [Thread-1  ]: Postgres adapter: Postgres error: syntax error at or near ".14"
LINE 56: ...m"."<my development env>"."stg__cln_ra__sponsored_brands".14_day_to...
                                                                     ^

00:18:27.088392 [debug] [Thread-1  ]: On model.<>.cln_ra__sponsored_brands: ROLLBACK

My yaml file is standard:

version: 2

models:
  - name: cln_ra__sponsored_brands
    description: Traffic data for the brand page. This data is at a date-level and describes media spend and traffic associated with the brand page.
    columns:
...
      - name: 14_day_total_orders
        description: Total number of orders that were purchased through the ad over a 14 day period

I can get the run to complete with a different column name. It fails with the column description and without it.

@dbeatty10
Copy link
Contributor

I don't see a yaml file in your example.

Yep, this was it -- thanks for pointing this out @tnorlund-dsc 🧠. See below for a full reproducible example.

Reprex

models/my_model.sql

{{
    config(
        materialized="table",
    )
}}

select 1 as "14_day_total_orders"

models/model_that_has_persist_docs.sql

{{
    config(
        materialized="table",
        persist_docs={"relation": true, "columns": true},
    )
}}

select * from {{ ref("my_model") }}

models/_models.yml

version: 2

models:
  - name: model_that_has_persist_docs
    columns:
      - name: 14_day_total_orders
        description: Total number of orders that were purchased through the ad over a 14 day period
dbt run --select my_model
dbt run --select model_that_has_persist_docs

@dbeatty10 dbeatty10 removed the triage label Jun 13, 2023
@dbeatty10
Copy link
Contributor

@tnorlund-dsc Sorry for the churn here, but I just realized we actually have a quote configuration for model columns to handle columns that need quoting.

In the example above, you'd specify that the column needs to be quoted like this:

version: 2

models:
  - name: model_that_has_persist_docs
    columns:
      - name: 14_day_total_orders
        quote: true
        description: Total number of orders that were purchased through the ad over a 14 day period

Could you give this a shot?

@tnorlund-dsc
Copy link
Author

That fixed it! Thanks @dbeatty10

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants