-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 Snowflake destination: snowflake s3 destination COPY is writing records from different table in the same raw table fix #5924
Conversation
… in the same raw table fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
...tion-jdbc/src/main/java/io/airbyte/integrations/destination/jdbc/copy/s3/S3StreamCopier.java
Show resolved
Hide resolved
@andriikorotkov are you able to add to the description? e.g. logs that you've produced or how the SQL for |
@danieldiamond, I slightly changed the formation of the name for the s3 file and added logs. you can see them in the description. If in your opinion everything looks good - I will merge this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for resolving this. One suggestion for readability
...ake/src/main/java/io/airbyte/integrations/destination/snowflake/SnowflakeS3StreamCopier.java
Outdated
Show resolved
Hide resolved
…5664-snowflake-s3-copy-fix
…5664-snowflake-s3-copy-fix
…5664-snowflake-s3-copy-fix
/test connector=connectors/destination-snowflake
|
…5664-snowflake-s3-copy-fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andriikorotkov is there a test we can add to make sure there are no regressions?
Could you also make an issue to verify this isn't happening on other Dbs?
Both of these can be handled after releasing this PR since it's a critical fix
/publish connector=connectors/destination-snowflake
|
@andriikorotkov I've published a version to unblock @danieldiamond -- so we can probably take care of the unit test in this PR and create the follow up issue for other DBs |
…5664-snowflake-s3-copy-fix
/test connector=connectors/destination-redshift
|
/publish connector=connectors/destination-redshift
|
@sherifnada I have reproduced and fixed this bug for redshift as well. I also made small changes for the tests so that we don't get this error in future destinations. now the problem is in the jobs. Can I merge this pull request if not all jobs have worked correctly? |
@andriikorotkov yup you can move forward |
…5664-snowflake-s3-copy-fix
What
Fix snowflake destination aws s3 Staging COPY is writing records from different table in the same raw table
How
This bug occurs due to the fact that the snowflake does not look for the full path to the file on s3, but uses this path to find all matches in the file path. For example, we have tables
user
anduser_something
. For the table user_something, one file will be found and copying will be done only from it. And for the user table, two files will be found: 1 - identifier/schema/user and 2 - identifier/schema/user_something - because 1 path is part of the second.I fixed this by prefixing the table name in the file path.
Here are the logs for my test tables (
actor
,actor_my
,actor_test
andactor_payment
) in the case when it works without a prefix:And my logs for my test tables (
actor
) in the case when it works with a prefix:Recommended reading order
x.java
y.python
Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/SUMMARY.md
docs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampledocs/integrations/README.md
airbyte-integrations/builds.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing./publish
command described hereUpdating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampleAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing./publish
command described hereConnector Generator
-scaffold
in their name) have been updated with the latest scaffold by running./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates
then checking in your changes