Fix S3ToRedshiftOperator #19358

Merged: 3 commits, Nov 3, 2021


@mariotaddeucci (Contributor) commented Nov 2, 2021

The bug occurs in S3ToRedshiftOperator under a specific configuration. When "UPSERT" or "REPLACE" is used, the operator generates an SQL block containing multiple queries. RedshiftSQLHook does not support executing multiple queries in a single call to execute. The fix is to convert the single query string into a list of queries.
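The idea of the fix can be sketched as follows. This is a minimal illustration, not the operator's actual code: the function name `build_upsert_statements`, the staging table name, the example bucket, and the SQL text are all assumptions made for the example. The point is only that each step becomes its own list element rather than part of one multi-statement string.

```python
from typing import List


def build_upsert_statements(
    schema: str, table: str, key: str, copy_statement: str
) -> List[str]:
    """Return the UPSERT steps as separate statements instead of one SQL block.

    Hypothetical helper: the statement text is illustrative and is not
    taken from the operator's source.
    """
    destination = f"{schema}.{table}"
    staging = f"#{table}"  # assumed name for a temporary staging table
    return [
        f"CREATE TABLE {staging} (LIKE {destination});",
        copy_statement,
        "BEGIN;",
        f"DELETE FROM {destination} USING {staging} "
        f"WHERE {destination}.{key} = {staging}.{key};",
        f"INSERT INTO {destination} SELECT * FROM {staging};",
        "COMMIT;",
    ]


# Example values are placeholders; credentials and COPY options are omitted.
statements = build_upsert_statements(
    schema="public",
    table="events",
    key="id",
    copy_statement="COPY #events FROM 's3://example-bucket/events';",
)
```

A hook that accepts either a string or a list of strings can then run the elements one at a time, so no single call carries more than one statement.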



@boring-cyborg bot added the area:providers and provider:amazon-aws (AWS/Amazon - related issues) labels Nov 2, 2021
@mariotaddeucci changed the title from "Fix S3 to redshift operator" to "Fix S3ToRedshiftOperator" Nov 2, 2021
@kaxil (Member) left a comment

@github-actions bot added the "okay to merge" label (It's ok to merge this PR as it does not require more tests) Nov 2, 2021
@github-actions bot commented Nov 2, 2021

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

@john-jac (Contributor) commented Nov 2, 2021

> Bug happens on S3ToRedshiftOperator with specific configuration. By using "UPSERT" or "REPLACE" is generate an sql block with multiple queries. The RedshiftSQLHook don't support execute multiple queries in a single call of execute. To fix it just need to convert the single query string to a list of queries.

Hi @mariotaddeucci. What Redshift configuration causes the issue?

@mariotaddeucci (Contributor, Author) commented

@john-jac When UPSERT or REPLACE is used with the new RedshiftSQLHook (introduced to remove the postgres provider dependency), the operator executes multiple queries in a transaction. Unlike the postgres hook, RedshiftSQLHook doesn't support multiple commands in a prepared statement. To fix the operator, it was necessary to change the multi-query string into a list of queries.
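The failure mode described here can be reproduced in spirit with a stand-in driver. The sketch below uses Python's built-in `sqlite3` module purely as an analogy, on the assumption that it behaves similarly in this one respect: its `cursor.execute` also rejects a string containing more than one statement. It is not the Redshift driver, and the table and queries are made up for the example.

```python
import sqlite3

# sqlite3 is only a stand-in here; it is NOT the driver RedshiftSQLHook uses.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One string holding two statements, like the SQL block built for UPSERT/REPLACE.
multi_query = "CREATE TABLE t (id INTEGER); INSERT INTO t VALUES (1);"

try:
    cur.execute(multi_query)
    rejected = False
except (sqlite3.Warning, sqlite3.ProgrammingError):
    # Older Python versions raise sqlite3.Warning, newer ones ProgrammingError.
    rejected = True

# The same work expressed as a list of single statements runs without error.
queries = [
    "CREATE TABLE t (id INTEGER)",
    "INSERT INTO t VALUES (1)",
]
for query in queries:
    cur.execute(query)
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM t").fetchone()[0]
conn.close()
```

Splitting at the point where the SQL is built, rather than parsing a finished string on `;`, avoids having to handle semicolons inside string literals.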

@potiuk potiuk merged commit 6148ddd into apache:main Nov 3, 2021
@mariotaddeucci mariotaddeucci deleted the s3-to-redshift-fixes branch November 3, 2021 11:49
Labels: area:providers, okay to merge (It's ok to merge this PR as it does not require more tests), provider:amazon-aws (AWS/Amazon - related issues)
4 participants