Postgres: Prevent temp relation identifiers from being too long #2850

elexisvenator · 2020-10-26T00:52:42Z

related #2197

Description

Currently, the postgres make_temp_relation macro adds a 29-character suffix to the end of the temp relation identifier (9 characters from the default suffix and 20 from the timestamp). This is a problem now that relation names longer than 63 characters raise exceptions.
The fix is to shorten the suffix and also trim the base_relation identifier so that the total length never exceeds 63 characters.

An exception is also raised if the default suffix is overridden with a value that is too long.
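
The length arithmetic behind the problem can be sketched in Python. This is an illustration only, not dbt code: the constant and function names are hypothetical, and the numbers (63, 9, 20) are the ones quoted in the description.

```python
# Illustration of the length arithmetic from the description (not dbt code).
MAX_IDENTIFIER_LENGTH = 63  # limit quoted in this PR
DEFAULT_SUFFIX_LENGTH = 9   # default suffix length, per the description
TIMESTAMP_LENGTH = 20       # timestamp portion, per the description


def old_temp_name_length(base_identifier: str) -> int:
    """Length of the temp identifier when the full suffix is simply appended."""
    return len(base_identifier) + DEFAULT_SUFFIX_LENGTH + TIMESTAMP_LENGTH


def longest_safe_base() -> int:
    """Longest base identifier that still fits once the suffix is appended."""
    return MAX_IDENTIFIER_LENGTH - DEFAULT_SUFFIX_LENGTH - TIMESTAMP_LENGTH
```

Under these numbers, any model name longer than 34 characters would push the untruncated temp identifier past the limit.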

Checklist

I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

jtcohen6

Thanks for this PR, @elexisvenator! Really cleverly done. This will enable us to say, once and for all, that 63 is the number we shall count, and the number of the counting shall be 63, regardless of materialization type.

Quick bits before we can merge this:

Can you add a changelog note?
I would have expected a failing test here. (More specifically, a pass where we don't expect it.) Could you take a quick look at 063_relation_name_tests and adjust so that we can test for (a) passage when a postgres incremental model name is ~60 chars (thanks to this PR), (b) failure when a postgres model name is itself >63 chars?

jtcohen6 · 2020-10-26T14:48:36Z

plugins/postgres/dbt/include/postgres/macros/adapters.sql

+ {% if suffix_length > relation_max_name_length %}
+ {% do exceptions.raise_compiler_error('Temp relation suffix is too long (' ~ suffix|length ~ ' characters). Maximum length is ' ~ (relation_max_name_length - dtstring|length) ~ ' characters.') %}
+ {% endif %}
+ {% set tmp_identifier = base_relation.identifier[:relation_max_name_length - suffix_length] ~ suffix ~ dtstring %}


I hesitate about replacing one silent truncation (by postgres) with another (by dbt), but I suppose this is only for temp relation naming, for use mid-materialization, and it won't have any effect on how users actually reference the final model.
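
For readers less familiar with Jinja, the truncation in the diff above can be sketched in Python. This is a minimal illustration, not the macro itself: the function name `make_temp_identifier` is hypothetical, the default of 63 is taken from the discussion, and the sketch assumes that `suffix_length` in the macro is the combined length of `suffix` and `dtstring`, which the excerpt does not show.

```python
def make_temp_identifier(base_identifier: str, suffix: str, dtstring: str,
                         max_length: int = 63) -> str:
    """Build a temp identifier that never exceeds max_length characters."""
    # Assumption: the macro's suffix_length covers both suffix and timestamp.
    suffix_length = len(suffix) + len(dtstring)
    if suffix_length > max_length:
        raise ValueError(
            f"Temp relation suffix is too long ({len(suffix)} characters). "
            f"Maximum length is {max_length - len(dtstring)} characters."
        )
    # Trim the base name so that base + suffix + timestamp fits in the budget.
    trimmed_base = base_identifier[: max_length - suffix_length]
    return trimmed_base + suffix + dtstring
```

Short names pass through unchanged; only names that would overflow the limit lose trailing characters from the base identifier, so the suffix and timestamp are always preserved.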

elexisvenator · 2020-10-26T22:13:48Z

> This will enable us to say, once and for all, that 63 is the number we shall count, and the number of the counting shall be 63, regardless of materialization type.

Almost, the actual limit is 51 due to most materializations creating persisted backup tables with the suffix __dbt_backup

I will add the changes tonight 👍
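
The figure of 51 follows from subtracting the backup suffix from the 63-character limit. A small Python sketch of that arithmetic (the function name is hypothetical; the suffix and the limit are the ones named in this thread):

```python
MAX_IDENTIFIER_LENGTH = 63      # limit discussed in this PR
BACKUP_SUFFIX = "__dbt_backup"  # suffix appended to persisted backup tables


def max_model_name_length(suffix: str = BACKUP_SUFFIX,
                          limit: int = MAX_IDENTIFIER_LENGTH) -> int:
    """Longest model name whose suffixed relation name still fits in the limit."""
    return limit - len(suffix)


def backup_name_fits(model_name: str) -> bool:
    """True if model_name plus the backup suffix stays within the limit."""
    return len(model_name + BACKUP_SUFFIX) <= MAX_IDENTIFIER_LENGTH
```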

jtcohen6 · 2020-10-27T14:13:47Z

> Almost, the actual limit is 51 due to most materializations creating persisted backup tables with the suffix __dbt_backup

Yes, that's silly and certainly for historical reasons. We should just use make_temp_relation in the table and view materializations as well, rather than rehashing all the logic for tmp_identifier and intermediate_relation.

If that's a change you want to address in this PR, I'd support it. You certainly don't have to, though; it's at the edge of the scope.

elexisvenator · 2020-10-27T22:12:33Z

I'm less certain about touching backup tables. They are persisted, and I frequently see them continuing to exist, especially if dbt fails for some reason during a run. If they had uniquely generated names, it wouldn't be possible to clean them up automatically.
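
To illustrate why a fixed suffix matters for cleanup, here is a hedged Python sketch. It is not part of dbt: `leftover_backups` and its inputs are hypothetical, and it only shows that a deterministic backup name can be derived from the model name alone.

```python
BACKUP_SUFFIX = "__dbt_backup"  # fixed suffix named in this thread


def leftover_backups(existing_relations, model_names):
    """Return relations that are exactly a known model name plus the fixed suffix."""
    # With a deterministic suffix, the expected backup names are computable
    # up front; a generated unique suffix would not allow this exact lookup.
    expected = {name + BACKUP_SUFFIX for name in model_names}
    return sorted(rel for rel in existing_relations if rel in expected)
```

Given the relations `orders`, `orders__dbt_backup`, and `customers`, with models `orders` and `customers`, the only leftover reported would be `orders__dbt_backup`.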

jtcohen6 · 2020-10-28T03:28:21Z

Ok, that's a fair point. I wouldn't want to give globally unique names to backup_relation, only to intermediate_relation, but there's still the matter of the suffix length. I'd like to give this more dedicated thought, but I think that's going to be a while from now, when it's finally necessary and proper to clean up our builtin materializations.

For the time being, then, 51 shall be the order of the day.

elexisvenator · 2020-11-03T10:00:17Z

I updated the integration tests to check names up to 51 characters, and added a separate check that triggers make_temp_relation in an incremental model. There is also a skipped test that would pass if a way to deal with the __dbt_backup suffix is found.

jtcohen6

Thank you so much for this contribution @elexisvenator!

cla-bot bot added the cla:yes label Oct 26, 2020

jtcohen6 reviewed Oct 26, 2020

elexisvenator added 3 commits November 3, 2020 20:56

Add integration tests

eff198d

Update changelog

2cd56ca

elexisvenator force-pushed the patch-1 branch from 2e8b6d4 to 2cd56ca Compare November 3, 2020 09:58

elexisvenator requested a review from jtcohen6 November 3, 2020 22:53

jtcohen6 mentioned this pull request Nov 8, 2020

Relation name '*__dbt_tmp' is longer than 63 characters #2869

Closed

5 tasks

jtcohen6 approved these changes Nov 9, 2020

jtcohen6 merged commit dcc32dc into dbt-labs:dev/kiyoshi-kuromiya Nov 9, 2020

elexisvenator deleted the patch-1 branch November 12, 2020 00:01

epapineau mentioned this pull request Apr 8, 2022

Truncate relation names when appending a suffix #4921

Merged

4 tasks

MattTriano mentioned this pull request May 6, 2023

Remove the report schema and its models MattTriano/analytics_data_where_house#122

Closed
