Support multiple SQL queries in Dataproc SQL job #44890

Merged
merged 7 commits into from
Dec 15, 2024


amirmor1
Contributor

DataProcJobBuilder.add_query(query) is misleading: it suggests that you can call it several times with different queries and that executing the job will then send all of them. In fact, only the last query is sent, because each call overrides the query list.

I've added a set_queries function that takes a list of strings and sends it. Dataproc supports a list of queries.
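
To make the problem concrete, here is a minimal sketch of the overwriting behavior described above. OverwritingJobBuilder is a hypothetical stand-in, not the provider's DataProcJobBuilder, and the nested dictionary layout (a job-type key holding a query_list with a queries field) is assumed for illustration.

```python
class OverwritingJobBuilder:
    """Hypothetical stand-in that mimics the misleading behavior."""

    def __init__(self, job_type: str) -> None:
        self.job_type = job_type
        # Assumed layout: {"job": {<job_type>: {"query_list": {"queries": [...]}}}}
        self.job: dict = {"job": {job_type: {}}}

    def add_query(self, query: str) -> None:
        # Despite the name, this replaces the whole query list on every call.
        self.job["job"][self.job_type]["query_list"] = {"queries": [query]}


builder = OverwritingJobBuilder("hive_job")
builder.add_query("SHOW DATABASES;")
builder.add_query("SHOW TABLES;")

# Only the query from the last call is left to be sent.
sent_queries = builder.job["job"]["hive_job"]["query_list"]["queries"]
print(sent_queries)
```

With this behavior, the first query is silently dropped and never reaches the job.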



@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Dec 12, 2024
Amir Mor added 3 commits December 14, 2024 10:38
As requested by the reviewer: instead of adding a function that sets a list of queries, fix the original add_query function so that it actually appends the query to the list of queries sent to the Dataproc job.
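
The append behavior requested here could look roughly like the sketch below. AppendingJobBuilder is a hypothetical stand-in with an assumed dictionary layout, not the merged provider code; see the PR diff for the actual change.

```python
class AppendingJobBuilder:
    """Hypothetical stand-in showing add_query appending instead of overriding."""

    def __init__(self, job_type: str) -> None:
        self.job_type = job_type
        # Assumed layout: {"job": {<job_type>: {"query_list": {"queries": [...]}}}}
        self.job: dict = {"job": {job_type: {}}}

    def add_query(self, query: str) -> None:
        # Create the query list on first use, then append to it on every call.
        query_list = self.job["job"][self.job_type].setdefault(
            "query_list", {"queries": []}
        )
        query_list["queries"].append(query)


builder = AppendingJobBuilder("hive_job")
builder.add_query("SHOW DATABASES;")
builder.add_query("SHOW TABLES;")

# Both queries are kept, in call order.
queued_queries = builder.job["job"]["hive_job"]["query_list"]["queries"]
print(queued_queries)
```

Using setdefault keeps the first call and later calls on the same code path, so callers do not need to initialize the list themselves.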
@amirmor1 amirmor1 force-pushed the support-multiple-queries-in-dataproc branch from 7223c00 to 8155625 Compare December 14, 2024 16:28
@shahar1 shahar1 merged commit 9d68013 into apache:main Dec 15, 2024
65 checks passed