BigQuery: Client.query modifies job_config object #9727

Closed
tswast opened this issue Nov 11, 2019 · 0 comments · Fixed by #9735
Labels: api: bigquery, priority: p1, type: bug


tswast commented Nov 11, 2019

Steps to reproduce

  1. Create a QueryJobConfig object.
  2. Pass the object to the Client.query method.
  3. Observe that the configuration object has changed.

Code example

from google.cloud import bigquery

client = bigquery.Client()
config = bigquery.QueryJobConfig(dry_run=True)
job = client.query(
    """
    SELECT name, SUM(number) AS total_people
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    GROUP BY name
    """,
    job_config=config
)
print(job.total_bytes_processed)

# Reuse the same config object for a real (non-dry-run) query.
config.dry_run = False
job = client.query(
    """
    SELECT name, SUM(number) AS total_people
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    GROUP BY name
    """,
    job_config=config
)
job.result()  # Raises BadRequest (see stack trace).

Stack trace

---------------------------------------------------------------------------
BadRequest                                Traceback (most recent call last)
<ipython-input-17-97f4f20faa6a> in <module>
----> 1 job.result()

~/miniconda3/envs/scratch/lib/python3.7/site-packages/google/cloud/bigquery/job.py in result(self, timeout, page_size, retry, max_results)
   2937         """
   2938         try:
-> 2939             super(QueryJob, self).result(timeout=timeout)
   2940 
   2941             # Return an iterator instead of returning the job.

~/miniconda3/envs/scratch/lib/python3.7/site-packages/google/cloud/bigquery/job.py in result(self, timeout, retry)
    732             self._begin(retry=retry)
    733         # TODO: modify PollingFuture so it can pass a retry argument to done().
--> 734         return super(_AsyncJob, self).result(timeout=timeout)
    735 
    736     def cancelled(self):

~/miniconda3/envs/scratch/lib/python3.7/site-packages/google/api_core/future/polling.py in result(self, timeout)
    125             # pylint: disable=raising-bad-type
    126             # Pylint doesn't recognize that this is valid in this case.
--> 127             raise self._exception
    128 
    129         return self._result

BadRequest: 400 Cannot explicitly modify anonymous table swast-scratch:_def2aa82a75fc33513bfb65968607e1f49148f83.anon561d4b35377f56c5960a4a49be253d3c90c120db

(job ID: 2692472c-1213-4716-8393-d9c49fd7b678)

                -----Query Job SQL Follows-----                

    |    .    |    .    |    .    |    .    |    .    |
   1:
   2:    SELECT name, SUM(number) AS total_people
   3:    FROM `bigquery-public-data.usa_names.usa_1910_current`
   4:    GROUP BY name
   5:    
    |    .    |    .    |    .    |    .    |    .    |

Running a job may modify the configuration object passed to it. In this example, the first query sets the destination table on config, so the second query fails when it reuses that object. Modifying the caller's original Python object is unexpected.
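One way to observe the change is to snapshot the config before the call and compare afterwards. The sketch below is a hypothetical helper, not part of the library; it assumes only that the config exposes `to_api_repr()` returning a dict, as `QueryJobConfig` does.

```python
import copy


def config_diff_after(call, job_config):
    """Run ``call()`` and return the top-level config keys that changed.

    Hypothetical helper for illustration. ``job_config`` only needs a
    ``to_api_repr()`` method that returns a dict.
    """
    # Deep-copy the snapshot so later mutation cannot alter it.
    before = copy.deepcopy(job_config.to_api_repr())
    call()
    after = job_config.to_api_repr()
    # Map each added, removed, or altered key to its (before, after) values.
    return {
        key: (before.get(key), after.get(key))
        for key in set(before) | set(after)
        if before.get(key) != after.get(key)
    }
```

For the example above, this could be called as `config_diff_after(lambda: client.query(sql, job_config=config), config)`, where `sql` holds the query text. An empty dict means the call left the config untouched.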

I propose that Client.query (and the other "job" methods on Client) be updated to make a deep copy of the job_config argument before passing it on to the job constructor.
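A minimal sketch of the proposed pattern, using toy stand-in classes rather than the real library code. `ToyJobConfig`, `ToyJob`, and `ToyClient` are hypothetical names, and the way `ToyJob` writes to its config is an assumption that only mimics the reported behavior.

```python
import copy


class ToyJobConfig:
    """Stand-in for a job config; not the real QueryJobConfig."""

    def __init__(self, dry_run=False):
        self.dry_run = dry_run
        self.destination = None


class ToyJob:
    """Stand-in for a job that writes to the config it is given."""

    def __init__(self, query, job_config):
        self.query = query
        self.job_config = job_config
        # Mimics the reported behavior: the job fills in a destination.
        self.job_config.destination = "anonymous_table"


class ToyClient:
    def query(self, query, job_config=None):
        # Proposed change: copy first, so the job never sees the caller's object.
        if job_config is not None:
            job_config = copy.deepcopy(job_config)
        else:
            job_config = ToyJobConfig()
        return ToyJob(query, job_config)


config = ToyJobConfig(dry_run=True)
job = ToyClient().query("SELECT 1", job_config=config)
assert config.destination is None  # the caller's object is untouched
assert job.job_config.destination == "anonymous_table"
```

With the copy in place, the caller can flip `dry_run` and reuse `config` without inheriting state from the earlier job.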
