🎉 MySQL destination: normalization #4163
/test connector=bases/base-normalization
@ChristopheDuong, I am running into this error in the MySQL integration test:
Do you know why the truncation is not working? It worked in the unit tests. Here are the logs.
It seems to be working fine, for example on table names created by normalization:
but it is not applied to the names of the raw destination tables that are given as input tables to normalization:
Do you know how would
After downloading and looking at the output log from the
So I would say that you need to make sure the destination implements some logic to fit long stream names successfully, and that normalization reproduces the same logic when guessing the input table name... (you are experiencing here one of the pain points I was writing about in my refactoring of Configured Catalog document)
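A minimal sketch of the idea above, assuming the destination and normalization can share one deterministic naming rule. The function names, the hash choice (SHA-256), and the 8-character suffix are illustrative assumptions, not the code in this PR; the 64-character limit is MySQL's maximum identifier length for table names.

```python
import hashlib

# MySQL limits table names to 64 characters.
MYSQL_MAX_IDENTIFIER_LENGTH = 64


def truncate_with_hash(
    name: str,
    max_length: int = MYSQL_MAX_IDENTIFIER_LENGTH,
    hash_length: int = 8,
) -> str:
    """Hypothetical helper: shorten `name` to `max_length` characters.

    Names that already fit are returned unchanged. Longer names keep a
    prefix and get a short hash of the full name appended, so two long
    names sharing a prefix still map to different table names.
    """
    if hash_length >= max_length - 1:
        raise ValueError("hash_length leaves no room for a name prefix")
    if len(name) <= max_length:
        return name
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:hash_length]
    prefix_length = max_length - hash_length - 1  # 1 for the "_" separator
    return f"{name[:prefix_length]}_{digest}"


def raw_table_name(stream_name: str) -> str:
    """Hypothetical: name of the raw table for a stream.

    The "_airbyte_raw_" prefix is used here for illustration only.
    """
    return truncate_with_hash("_airbyte_raw_" + stream_name)
```

If both the destination (when creating the raw table) and normalization (when guessing its input table) applied the same rule, they would agree on the name without any extra coordination, because the result depends only on the stream name.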
airbyte-integrations/bases/base-normalization/unit_tests/test_table_name_registry.py
...zation/integration_tests/resources/test_primary_key_streams/dbt_schema_tests/schema_test.yml
/test connector=bases/base-normalization
@ChristopheDuong, this PR is ready for review. However, I was not able to do the following with this PR:
Can we merge this PR as is for now? I can create issues to track the remaining problems. I plan to check in the new MySQL test output after the code review is done, because there are lots of files and they can make it harder to review the code. But I can push them if you prefer to review those files as well.
/test connector=bases/base-normalization
It seems @marcosmarxm added an extra test case with the dedup fix with CDC... It is so weird, MySQL runs this query fine:
but fails with this because the column (note that
...ons/bases/base-normalization/dbt-project-template/macros/cross_db_utils/type_conversions.sql
# In MySQL, the max number of columns is limited by row size (8KB),
# not by absolute column count. It is way fewer than 1500.
return
Not to be solved in this PR, but we should probably handle the validity of rows (in terms of size) in destination-mysql and log some warnings/errors or reject such records?
About 250 columns seems really low, especially for sources like hubspot/salesforce, which could easily run into exceptions there.
The max number of columns is related to the size of each column. In the test, we are creating string columns, which correspond to char columns in MySQL, and each column is relatively large. That's why as few as 250 columns can fail the test. In reality, not every column will be of type char, and the max is 4096 in MySQL.
The root cause of the casting failure is that when using
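The relationship between column width and column count described in this thread can be illustrated with simple arithmetic. This is a rough sketch only: the 8 KB row budget is taken from the code comment under review, the 4096 cap is the MySQL column limit mentioned above, and the per-column byte widths are hypothetical. Real MySQL accounting depends on the storage engine, row format, and character set, so treat the numbers as an approximation.

```python
# Per-table column limit in MySQL, as mentioned in the discussion.
MYSQL_MAX_COLUMNS = 4096


def max_columns_for_row_budget(row_budget_bytes: int, bytes_per_column: int) -> int:
    """Hypothetical estimate: how many equally sized columns fit in a row.

    Assumes every column consumes `bytes_per_column` bytes of the row
    budget, which is a simplification of how MySQL stores rows.
    """
    if bytes_per_column <= 0:
        raise ValueError("bytes_per_column must be positive")
    return min(MYSQL_MAX_COLUMNS, row_budget_bytes // bytes_per_column)


# With an 8 KB budget, wide columns exhaust the row long before the
# 4096-column cap is reached; narrow columns hit the cap first.
wide = max_columns_for_row_budget(8 * 1024, 32)
narrow = max_columns_for_row_budget(8 * 1024, 1)
```

Under these assumptions, columns that take about 32 bytes each already limit the table to a few hundred columns, which is consistent with a test that fails at roughly 250 large string columns.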
/test connector=bases/base-normalization
/test connector=bases/base-normalization
Good! Don't forget to bump and publish the MySQL destination and the normalization image then.
/publish connector=bases/base-normalization
What
How
Describe the solution
Todo
Append hash to table name
Investigate dependency specification
dbt-mysql to Dockerfile does not solve the problem
pip install dbt-mysql there still resulted in missing dependency exception related to dbt.
dbt version to 0.19.1
dbt-mysql requires dbt 0.19.0
dbt-mysql that depends on dbt 0.19.1 resulted in a charset issue and test failures.
Recommended reading order
x.java
y.python
Pre-merge Checklist
Expand the checklist which is relevant for this PR.
airbyte_secret in output spec
./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
/test connector=connectors/<name> command as documented here is passing.
docs/integrations/ directory.
/publish command described here