🐛 Normalization: Fix sync from HubSpot to MySQL fails with "Row size too large" on create table #10485
Bump docker version. Update basic-normalization.md docs.
@sergei-solonitcyn tested backward compatibility. Everything works fine.
Also it's possible to get another issue:
/test connector=bases/base-normalization
/publish connector=bases/base-normalization
Since this PR changes the SQL files generated by normalization-mysql, the integration test outputs should also be included in the PR: they will show what the actual change would be in terms of the final native SQL queries. See #10837
elif self.destination_type == DestinationType.MYSQL:
    # Cast to `text` datatype. See https://github.com/airbytehq/airbyte/issues/7994
    sql_type = f"{sql_type}(1024)"
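For context, here is a minimal, self-contained sketch of what this branch produces. The `DestinationType` enum and both helper functions below are hypothetical stand-ins written for illustration, not the project's actual code; only the `(1024)` suffix and the resulting `cast(field as char(1024))` expression come from this PR.

```python
from enum import Enum


class DestinationType(Enum):
    # Hypothetical stand-in for the project's enum; only the members used here.
    MYSQL = "mysql"
    POSTGRES = "postgres"


def string_cast_type(destination_type: DestinationType, sql_type: str = "char") -> str:
    """Return the type used when casting a string field (illustrative helper)."""
    if destination_type == DestinationType.MYSQL:
        # Mirrors the diff: append an explicit length to the MySQL cast type.
        return f"{sql_type}(1024)"
    return sql_type


def cast_expression(field: str, destination_type: DestinationType) -> str:
    """Build the SQL cast expression for a field (illustrative helper)."""
    return f"cast({field} as {string_cast_type(destination_type)})"


print(cast_expression("field", DestinationType.MYSQL))
```

Other destinations keep the unmodified type in this sketch, so only the MySQL cast gains the explicit length.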
The proper way to change a datatype for a specific destination is in the dbt macros, not directly in the Python code:
see
Could this be causing this error?
Failure Origin: normalization, Message: Normalization failed during the dbt run. This may indicate a problem with the data itself.
1292 (22007): Truncated incorrect CHAR(1024) value:
What
Closes #7994.
Changes default string casting from `varchar(512)` to `text`.
How
String fields were cast with `cast(field as char)`, which led to a `varchar(512)` field. A `varchar(512)` field may use as much as 512 * 4 = 2048 bytes in utf8mb4 encoding, so a table with many such fields can fail with "Row size too large" (`marketing_emails` has 38 of those fields).
So as a fix, I've updated the MySQL create table query to cast string fields as `cast(field as char(1024))`, which leads to a `text` field type on the created table (`cast(field as text)` is not a valid statement, see https://dev.mysql.com/doc/refman/8.0/en/cast-functions.html#function_cast).
Experimentally, I found that values less than 1024 are converted to varchar types, while larger values may lead to mediumtext or longtext, which are too large.
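The row-size arithmetic above can be checked with a short sketch. It assumes MySQL's documented 65,535-byte maximum row size, which this description does not state explicitly. The constant and function names are illustrative only, and the estimate counts character data alone (length-prefix bytes and other columns are ignored).

```python
# Assumption: MySQL's documented maximum row size in bytes.
MAX_ROW_BYTES = 65_535
# utf8mb4 uses up to 4 bytes per character.
BYTES_PER_CHAR_UTF8MB4 = 4


def varchar_bytes(num_fields: int, length: int) -> int:
    """Maximum bytes of character data for `num_fields` varchar(length) columns."""
    return num_fields * length * BYTES_PER_CHAR_UTF8MB4


# marketing_emails has 38 string fields, each created as varchar(512).
needed = varchar_bytes(num_fields=38, length=512)
print(needed, needed > MAX_ROW_BYTES)
```

Under these assumptions the 38 fields alone need 38 * 2048 = 77,824 bytes, which is over the limit. MySQL documents that `text` columns contribute only 9 to 12 bytes each toward the row size limit, which is consistent with the switch to `text` avoiding the error.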
Recommended reading order
stream_processor.py
🚨 User Impact 🚨
This would change `varchar` fields in MySQL tables to `text`.