
BigQuery: 'test_extract_table' snippet, bucket creation flakes with 500 #5886

Closed

tseaver opened this issue Sep 4, 2018 · 6 comments

Labels: api: bigquery, flaky, testing, type: process

Comments

tseaver (Contributor) commented Sep 4, 2018

Similar to #5746, #5747, #5748, but with a 500 error instead of a 429.

See: https://circleci.com/gh/GoogleCloudPlatform/google-cloud-python/7904 (first error in snippets-2-7 run).

______________________________ test_extract_table ______________________________

client = <google.cloud.bigquery.client.Client object at 0x7f2f23608f50>
to_delete = []

    def test_extract_table(client, to_delete):
        from google.cloud import storage
    
        bucket_name = 'extract_shakespeare_{}'.format(_millis())
        storage_client = storage.Client()
>       bucket = retry_429(storage_client.create_bucket)(bucket_name)

../docs/bigquery/snippets.py:1986: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../test_utils/test_utils/retry.py:95: in wrapped_function
    return to_wrap(*args, **kwargs)
../storage/google/cloud/storage/client.py:285: in create_bucket
    bucket.create(client=self, project=project)
../storage/google/cloud/storage/bucket.py:309: in create
    data=properties, _target_object=self)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <google.cloud.storage._http.Connection object at 0x7f2f23596050>
method = 'POST', path = '/b', query_params = {'project': 'precise-truck-742'}
data = '{"name": "extract_shakespeare_1536076519795"}'
content_type = 'application/json', headers = None, api_base_url = None
api_version = None, expect_json = True
_target_object = <Bucket: extract_shakespeare_1536076519795>

    def api_request(self, method, path, query_params=None,
                    data=None, content_type=None, headers=None,
                    api_base_url=None, api_version=None,
                    expect_json=True, _target_object=None):
        """Make a request over the HTTP transport to the API.
    
            You shouldn't need to use this method, but if you plan to
            interact with the API using these primitives, this is the
            correct one to use.
    
            :type method: str
            :param method: The HTTP method name (ie, ``GET``, ``POST``, etc).
                           Required.
    
            :type path: str
            :param path: The path to the resource (ie, ``'/b/bucket-name'``).
                         Required.
    
            :type query_params: dict or list
            :param query_params: A dictionary of keys and values (or list of
                                 key-value pairs) to insert into the query
                                 string of the URL.
    
            :type data: str
            :param data: The data to send as the body of the request. Default is
                         the empty string.
    
            :type content_type: str
            :param content_type: The proper MIME type of the data provided. Default
                                 is None.
    
            :type headers: dict
            :param headers: extra HTTP headers to be sent with the request.
    
            :type api_base_url: str
            :param api_base_url: The base URL for the API endpoint.
                                 Typically you won't have to provide this.
                                 Default is the standard API base URL.
    
            :type api_version: str
            :param api_version: The version of the API to call.  Typically
                                you shouldn't provide this and instead use
                                the default for the library.  Default is the
                                latest API version supported by
                                google-cloud-python.
    
            :type expect_json: bool
            :param expect_json: If True, this method will try to parse the
                                response as JSON and raise an exception if
                                that cannot be done.  Default is True.
    
            :type _target_object: :class:`object`
            :param _target_object:
                (Optional) Protected argument to be used by library callers. This
                can allow custom behavior, for example, to defer an HTTP request
                and complete initialization of the object at a later time.
    
            :raises ~google.cloud.exceptions.GoogleCloudError: if the response code
                is not 200 OK.
            :raises ValueError: if the response content type is not JSON.
            :rtype: dict or str
            :returns: The API response payload, either as a raw string or
                      a dictionary if the response is valid JSON.
            """
        url = self.build_api_url(path=path, query_params=query_params,
                                 api_base_url=api_base_url,
                                 api_version=api_version)
    
        # Making the executive decision that any dictionary
        # data will be sent properly as JSON.
        if data and isinstance(data, dict):
            data = json.dumps(data)
            content_type = 'application/json'
    
        response = self._make_request(
            method=method, url=url, data=data, content_type=content_type,
            headers=headers, target_object=_target_object)
    
        if not 200 <= response.status_code < 300:
>           raise exceptions.from_http_response(response)
E           InternalServerError: 500 POST https://www.googleapis.com/storage/v1/b?project=precise-truck-742: Backend Error
tseaver added the testing, api: bigquery, type: process, and flaky labels on Sep 4, 2018
tseaver (Contributor, Author) commented Sep 4, 2018

The snippets-3-6 run in that same CI job has multiple 503 failures, at points already guarded by retry_429:

  • Deleting items during tearDown.
  • Creating a bucket during test_extract_table_json

And a new, unguarded one for bucket.get_blob in test_extract_table.

shollyman (Contributor) commented
Possibly related to https://status.cloud.google.com/incident/storage/18003

tseaver (Contributor, Author) commented Sep 4, 2018

@shollyman Thanks for linking the incident: that is a likely culprit.

tseaver changed the title from "Bigquery: 'test_extract_table' snippet, bucket creation flakes with 500" to "BigQuery: 'test_extract_table' snippet, bucket creation flakes with 500" on Sep 6, 2018
tseaver (Contributor, Author) commented Sep 6, 2018

@tswast, @shollyman Should we just decide to write off those GCS failures as "sunspots"? Or do you think we need to harden our tests against 500 / 503 response codes from GCS (in addition to the 429 we already retry for)?
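For reference, hardening against 500 / 503 could look roughly like a broadened version of the existing retry_429 guard. This is a minimal stdlib sketch, not the actual helper in test_utils/retry.py; the names `retry_transient` and `TRANSIENT_CODES` are illustrative, and it assumes (as with google.api_core exceptions) that the raised error carries the HTTP status in a `code` attribute.

```python
import functools
import time

# Hypothetical generalization of retry_429: retry on any exception whose
# `code` attribute is a transient HTTP status, with exponential backoff.
TRANSIENT_CODES = {429, 500, 503}

def retry_transient(to_wrap, max_tries=5, base_delay=1.0):
    @functools.wraps(to_wrap)
    def wrapped_function(*args, **kwargs):
        delay = base_delay
        for attempt in range(1, max_tries + 1):
            try:
                return to_wrap(*args, **kwargs)
            except Exception as exc:
                code = getattr(exc, "code", None)
                # Re-raise non-transient errors, and transient ones
                # once the retry budget is exhausted.
                if code not in TRANSIENT_CODES or attempt == max_tries:
                    raise
                time.sleep(delay)
                delay *= 2  # back off exponentially between attempts
    return wrapped_function
```

Usage would mirror the snippet under test, e.g. `bucket = retry_transient(storage_client.create_bucket)(bucket_name)`.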

tseaver (Contributor, Author) commented Sep 6, 2018

Ugh, just happened again.

tswast (Contributor) commented Sep 6, 2018

We could change the bucket to regional instead of multi-regional? We could also put the bucket creation / destruction into a module-level test fixture to avoid doing as many calls to GCS.

3 participants