Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relation name '*__dbt_tmp' is longer than 63 characters #2869

Closed
1 of 5 tasks
moltar opened this issue Nov 6, 2020 · 9 comments · Fixed by #4921
Closed
1 of 5 tasks

Relation name '*__dbt_tmp' is longer than 63 characters #2869

moltar opened this issue Nov 6, 2020 · 9 comments · Fixed by #4921
Labels
bug Something isn't working good_first_issue Straightforward + self-contained changes, good for new contributors! postgres Team:Adapters Issues designated for the adapter area of the code
Milestone

Comments

@moltar
Copy link

moltar commented Nov 6, 2020

Describe the bug

My view name is pretty long, but it is less than 63 characters.

When dbt adds the __dbt_tmp suffix, it goes over the limit of 63 chars.

The following error is thrown by PostgreSQL:

Relation name 'foo__dbt_tmp' is longer than 63 characters

Steps To Reproduce

  1. Use a view name that is 63 characters long.
  2. dbt run

Expected behavior

To work anyways.

Screenshots and log output

See above.

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

Using Docker image fishtownanalytics/dbt:0.18.1.

The operating system you're using:

Using Docker image fishtownanalytics/dbt:0.18.1.

The output of python --version:

Using Docker image fishtownanalytics/dbt:0.18.1.

Additional context

Perhaps the temporary name can be truncated by the length of the suffix to make sure it fits within the limits?

@moltar moltar added bug Something isn't working triage labels Nov 6, 2020
@jtcohen6
Copy link
Contributor

jtcohen6 commented Nov 8, 2020

Thanks for opening, @moltar! I agree, this is peskier than it could be.

Following the discussion about this over in #2850, the real max character length of a model name on Postgres is currently 51, because of the appended suffix __dbt_backup in the table and view materializations.

I think we could do as you recommend, and truncate the model name (if >51 characters) to accommodate the suffix. I view that as a reasonable next step on top of the work in #2850, which handles the truncation of uniquely suffixed identifiers in the incremental materialization.

@jtcohen6 jtcohen6 added good_first_issue Straightforward + self-contained changes, good for new contributors! redshift and removed triage labels Nov 8, 2020
@moltar
Copy link
Author

moltar commented Nov 8, 2020

Thank you.

I see this tagged as redshift/pg.

I'd like to note that Redshift names seem to allow names of 127 bytes long.

https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_NEW.html

@ghost
Copy link

ghost commented Dec 7, 2020

With snapshots the suffix is even longer, e.g. dbt_tmp20201206072656412865. If you add _snapshot to the relation name, not much characters are left to use for the relation name.
Please also consider snapshots when addressing this issue. Thanks:)

@jtcohen6
Copy link
Contributor

jtcohen6 commented Dec 7, 2020

Heard @Marinto. Like the incremental materialization, the snapshot materialization already uses make_temp_relation, which was addressed in #2850. The question in this issue is how to best cross-apply a similar improvement to the view + table materializations, which do not use make_temp_relation.

@ghost
Copy link

ghost commented Dec 7, 2020

@jtcohen6 ah I see, I wasn't aware of #2850. This is exactly what I was looking for 👍

@danielefrigo
Copy link
Contributor

Thanks for opening, @moltar! I agree, this is peskier than it could be.

Following the discussion about this over in #2850, the real max character length of a model name on Postgres is currently 51, because of the appended suffix __dbt_backup in the table and view materializations.

I think we could do as you recommend, and truncate the model name (if >51 characters) to accommodate the suffix. I view that as a reasonable next step on top of the work in #2850, which handles the truncation of uniquely suffixed identifiers in the incremental materialization.

Regarding the proposal of truncating the model name, I see a potential overlapping issue: you could have 2 models running in parallel, with a long common prefix in the name. If we truncate the model name, the tmp tables created by dbt could end up having exactly the same name.
An alternative could be to completely loose the name link between the model name and the temp table name, using some kind of unique short identifier (e.g. an hash of the model name).

@jtcohen6
Copy link
Contributor

@danielefrigo A hash of the model name makes a lot of sense to me! I'd still like to keep the conventional suffixes (__dbt_backup, __dbt_tmp) as well, for clarity of ownership.

@epapineau
Copy link
Contributor

I'm interested in tackling this as a first issue. After digging into the code some, it looks like it could potentially be handled in a couple of places:

  1. plugins/postgres/dbt/adapters/postgres/relation.py
    Add truncating before known suffixes in the PostgresRelation __post_init__ method
  2. core/dbt/include/global_project/macros/materializations/models/view/view.sql
    Conditional truncating if using the postgres adapter

Do either of these seem like an appropriate approach?

@jtcohen6
Copy link
Contributor

jtcohen6 commented Mar 2, 2022

@epapineau I'd be excited to have you contribute this one!

I think having the truncation/hashing logic live inside the Postgres adapter makes the most sense. Of the two approaches you've recommended, I'd hesitate to add it to the PostgresRelation class itself, since that's used all over, even with the check for known suffixes.

My instinct would be to follow the same approach we took with make_temp_relation:

So I think the move is:

  • Create a make_backup_relation macro that follows the same pattern
  • Update the global view + table materializations to use make_temp_relation + make_backup_relation, instead of hard-coding the suffix logic themselves

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good_first_issue Straightforward + self-contained changes, good for new contributors! postgres Team:Adapters Issues designated for the adapter area of the code
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants