tests.system.test_to_gbq: test_series_round_trip[load_parquet-input_series1-api_methods1] failed #439

Closed
flaky-bot bot opened this issue Nov 30, 2021 · 4 comments
Assignees
Labels
api: bigquery Issues related to the googleapis/python-bigquery-pandas API. flakybot: flaky Tells the Flaky Bot not to close or comment on this issue. flakybot: issue An issue filed by the Flaky Bot. Should not be added manually. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

flaky-bot bot commented Nov 30, 2021

This test failed!

To configure my behavior, see the Flaky Bot documentation.

If I'm commenting on this issue too often, add the flakybot: quiet label and
I will stop commenting.


commit: 928e47b
buildURL: Build Status, Sponge
status: failed

Test output
args = (parent: "projects/precise-truck-742"
read_session {
  data_format: ARROW
  table: "projects/precise-truck-742/dataset...round_trip_73611"
  read_options {
    arrow_serialization_options {
      buffer_compression: LZ4_FRAME
    }
  }
}
,)
kwargs = {'metadata': [('x-goog-request-params', 'read_session.table=projects/precise-truck-742/datasets/python_bigquery_pandas...191234_bc3553/tables/round_trip_73611'), ('x-goog-api-client', 'gl-python/3.8.12 grpc/1.42.0 gax/2.2.2 gapic/2.10.1')]}
@functools.wraps(callable_)
def error_remapped_callable(*args, **kwargs):
    try:
        return callable_(*args, **kwargs)

.nox/prerelease/lib/python3.8/site-packages/google/api_core/grpc_helpers.py:66:


self = <grpc._channel._UnaryUnaryMultiCallable object at 0x7f3e623a67f0>
request = parent: "projects/precise-truck-742"
read_session {
  data_format: ARROW
  table: "projects/precise-truck-742/datasets...s/round_trip_73611"
  read_options {
    arrow_serialization_options {
      buffer_compression: LZ4_FRAME
    }
  }
}

timeout = None
metadata = [('x-goog-request-params', 'read_session.table=projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system...0191234_bc3553/tables/round_trip_73611'), ('x-goog-api-client', 'gl-python/3.8.12 grpc/1.42.0 gax/2.2.2 gapic/2.10.1')]
credentials = None, wait_for_ready = None, compression = None

def __call__(self,
             request,
             timeout=None,
             metadata=None,
             credentials=None,
             wait_for_ready=None,
             compression=None):
    state, call, = self._blocking(request, timeout, metadata, credentials,
                                  wait_for_ready, compression)
    return _end_unary_response_blocking(state, call, False, None)

.nox/prerelease/lib/python3.8/site-packages/grpc/_channel.py:946:


state = <grpc._channel._RPCState object at 0x7f3e3b785ca0>
call = <grpc._cython.cygrpc.SegregatedCall object at 0x7f3e3b7a1c40>
with_call = False, deadline = None

def _end_unary_response_blocking(state, call, with_call, deadline):
    if state.code is grpc.StatusCode.OK:
        if with_call:
            rendezvous = _MultiThreadedRendezvous(state, call, None, deadline)
            return state.response, rendezvous
        else:
            return state.response
    else:
        raise _InactiveRpcError(state)

E grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
E status = StatusCode.UNAVAILABLE
E details = "502:Bad Gateway"
E debug_error_string = "{"created":"@1638300163.180055362","description":"Error received from peer ipv4:74.125.197.95:443","file":"src/core/lib/surface/call.cc","file_line":1063,"grpc_message":"502:Bad Gateway","grpc_status":14}"
E >

.nox/prerelease/lib/python3.8/site-packages/grpc/_channel.py:849: _InactiveRpcError

The above exception was the direct cause of the following exception:

target = functools.partial(<function _wrap_unary_errors.<locals>.error_remapped_callable at 0x7f3e622a64c0>, parent: "projects/...191234_bc3553/tables/round_trip_73611'), ('x-goog-api-client', 'gl-python/3.8.12 grpc/1.42.0 gax/2.2.2 gapic/2.10.1')])
predicate = <function if_exception_type.<locals>.if_exception_type_predicate at 0x7f3e622a63a0>
sleep_generator = <generator object exponential_sleep_generator at 0x7f3e4aa4a040>
deadline = 600.0, on_error = None

def retry_target(target, predicate, sleep_generator, deadline, on_error=None):
    """Call a function and retry if it fails.

    This is the lowest-level retry helper. Generally, you'll use the
    higher-level retry helper :class:`Retry`.

    Args:
        target(Callable): The function to call and retry. This must be a
            nullary function - apply arguments with `functools.partial`.
        predicate (Callable[Exception]): A callable used to determine if an
            exception raised by the target should be considered retryable.
            It should return True to retry or False otherwise.
        sleep_generator (Iterable[float]): An infinite iterator that determines
            how long to sleep between retries.
        deadline (float): How long to keep retrying the target. The last sleep
            period is shortened as necessary, so that the last retry runs at
            ``deadline`` (and not considerably beyond it).
        on_error (Callable[Exception]): A function to call while processing a
            retryable exception.  Any error raised by this function will *not*
            be caught.

    Returns:
        Any: the return value of the target function.

    Raises:
        google.api_core.RetryError: If the deadline is exceeded while retrying.
        ValueError: If the sleep generator stops yielding values.
        Exception: If the target raises a method that isn't retryable.
    """
    if deadline is not None:
        deadline_datetime = datetime_helpers.utcnow() + datetime.timedelta(
            seconds=deadline
        )
    else:
        deadline_datetime = None

    last_exc = None

    for sleep in sleep_generator:
        try:
            return target()

.nox/prerelease/lib/python3.8/site-packages/google/api_core/retry.py:190:


args = (parent: "projects/precise-truck-742"
read_session {
  data_format: ARROW
  table: "projects/precise-truck-742/dataset...round_trip_73611"
  read_options {
    arrow_serialization_options {
      buffer_compression: LZ4_FRAME
    }
  }
}
,)
kwargs = {'metadata': [('x-goog-request-params', 'read_session.table=projects/precise-truck-742/datasets/python_bigquery_pandas...191234_bc3553/tables/round_trip_73611'), ('x-goog-api-client', 'gl-python/3.8.12 grpc/1.42.0 gax/2.2.2 gapic/2.10.1')]}

@functools.wraps(callable_)
def error_remapped_callable(*args, **kwargs):
    try:
        return callable_(*args, **kwargs)
    except grpc.RpcError as exc:
        raise exceptions.from_grpc_error(exc) from exc

E google.api_core.exceptions.ServiceUnavailable: 503 502:Bad Gateway

.nox/prerelease/lib/python3.8/site-packages/google/api_core/grpc_helpers.py:68: ServiceUnavailable

The above exception was the direct cause of the following exception:

method_under_test = functools.partial(<function to_gbq at 0x7f3e688499d0>, project_id='precise-truck-742', credentials=<google.oauth2.service_account.Credentials object at 0x7f3e68768d60>)
random_dataset_id = 'python_bigquery_pandas_tests_system_20211130191234_bc3553'
bigquery_client = <google.cloud.bigquery.client.Client object at 0x7f3e687681c0>
input_series = 0 Skywalker™
1 abc
2 defg
3 hülle
4 信用卡
Name: test_col, dtype: object
api_method = 'load_parquet', api_methods = {'load_csv', 'load_parquet'}

@pytest.mark.parametrize(
    ["input_series", "api_methods"],
    [
        # Ensure that 64-bit floating point numbers are unchanged.
        # See: https://github.com/pydata/pandas-gbq/issues/326
        SeriesRoundTripTestCase(
            input_series=pandas.Series(
                [
                    0.14285714285714285,
                    0.4406779661016949,
                    1.05148,
                    1.05153,
                    1.8571428571428572,
                    2.718281828459045,
                    3.141592653589793,
                    2.0988936657440586e43,
                ],
                name="test_col",
            ),
        ),
        SeriesRoundTripTestCase(
            input_series=pandas.Series(
                [
                    "abc",
                    "defg",
                    # Ensure that unicode characters are encoded. See:
                    # https://github.com/googleapis/python-bigquery-pandas/issues/106
                    "信用卡",
                    "Skywalker™",
                    "hülle",
                ],
                name="test_col",
            ),
        ),
        SeriesRoundTripTestCase(
            input_series=pandas.Series(
                [
                    "abc",
                    "defg",
                    # Ensure that empty strings are written as empty string,
                    # not NULL. See:
                    # https://github.com/googleapis/python-bigquery-pandas/issues/366
                    "",
                    None,
                ],
                name="empty_strings",
            ),
            # BigQuery CSV loader uses empty string as the "null marker" by
            # default. Potentially one could choose a rarely used character or
            # string as the null marker to disambiguate null from empty string,
            # but then that string couldn't be loaded.
            # TODO: Revisit when custom load job configuration is supported.
            #       https://github.com/googleapis/python-bigquery-pandas/issues/425
            api_methods={"load_parquet"},
        ),
    ],
)
def test_series_round_trip(
    method_under_test,
    random_dataset_id,
    bigquery_client,
    input_series,
    api_method,
    api_methods,
):
    if api_method not in api_methods:
        pytest.skip(f"{api_method} not supported.")
    table_id = f"{random_dataset_id}.round_trip_{random.randrange(1_000_000)}"
    input_series = input_series.sort_values().reset_index(drop=True)
    df = pandas.DataFrame(
        # Some errors only occur in multi-column dataframes. See:
        # https://github.com/googleapis/python-bigquery-pandas/issues/366
        {"test_col": input_series, "test_col2": input_series}
    )
    method_under_test(df, table_id, api_method=api_method)
    round_trip = bigquery_client.list_rows(table_id).to_dataframe()

tests/system/test_to_gbq.py:117:


.nox/prerelease/lib/python3.8/site-packages/google/cloud/bigquery/table.py:1905: in to_dataframe
record_batch = self.to_arrow(
.nox/prerelease/lib/python3.8/site-packages/google/cloud/bigquery/table.py:1708: in to_arrow
for record_batch in self._to_arrow_iterable(
.nox/prerelease/lib/python3.8/site-packages/google/cloud/bigquery/table.py:1613: in _to_page_iterable
yield from result_pages
.nox/prerelease/lib/python3.8/site-packages/google/cloud/bigquery/_pandas_helpers.py:823: in _download_table_bqstorage
session = bqstorage_client.create_read_session(
.nox/prerelease/lib/python3.8/site-packages/google/cloud/bigquery_storage_v1/services/big_query_read/client.py:516: in create_read_session
response = rpc(request, retry=retry, timeout=timeout, metadata=metadata,)
.nox/prerelease/lib/python3.8/site-packages/google/api_core/gapic_v1/method.py:154: in __call__
return wrapped_func(*args, **kwargs)
.nox/prerelease/lib/python3.8/site-packages/google/api_core/retry.py:283: in retry_wrapped_func
return retry_target(


target = functools.partial(<function _wrap_unary_errors.<locals>.error_remapped_callable at 0x7f3e622a64c0>, parent: "projects/...191234_bc3553/tables/round_trip_73611'), ('x-goog-api-client', 'gl-python/3.8.12 grpc/1.42.0 gax/2.2.2 gapic/2.10.1')])
predicate = <function if_exception_type.<locals>.if_exception_type_predicate at 0x7f3e622a63a0>
sleep_generator = <generator object exponential_sleep_generator at 0x7f3e4aa4a040>
deadline = 600.0, on_error = None

def retry_target(target, predicate, sleep_generator, deadline, on_error=None):
    """Call a function and retry if it fails.

    This is the lowest-level retry helper. Generally, you'll use the
    higher-level retry helper :class:`Retry`.

    Args:
        target(Callable): The function to call and retry. This must be a
            nullary function - apply arguments with `functools.partial`.
        predicate (Callable[Exception]): A callable used to determine if an
            exception raised by the target should be considered retryable.
            It should return True to retry or False otherwise.
        sleep_generator (Iterable[float]): An infinite iterator that determines
            how long to sleep between retries.
        deadline (float): How long to keep retrying the target. The last sleep
            period is shortened as necessary, so that the last retry runs at
            ``deadline`` (and not considerably beyond it).
        on_error (Callable[Exception]): A function to call while processing a
            retryable exception.  Any error raised by this function will *not*
            be caught.

    Returns:
        Any: the return value of the target function.

    Raises:
        google.api_core.RetryError: If the deadline is exceeded while retrying.
        ValueError: If the sleep generator stops yielding values.
        Exception: If the target raises a method that isn't retryable.
    """
    if deadline is not None:
        deadline_datetime = datetime_helpers.utcnow() + datetime.timedelta(
            seconds=deadline
        )
    else:
        deadline_datetime = None

    last_exc = None

    for sleep in sleep_generator:
        try:
            return target()

        # pylint: disable=broad-except
        # This function explicitly must deal with broad exceptions.
        except Exception as exc:
            if not predicate(exc):
                raise
            last_exc = exc
            if on_error is not None:
                on_error(exc)

        now = datetime_helpers.utcnow()

        if deadline_datetime is not None:
            if deadline_datetime <= now:
                raise exceptions.RetryError(
                    "Deadline of {:.1f}s exceeded while calling {}".format(
                        deadline, target
                    ),
                    last_exc,
                ) from last_exc

E google.api_core.exceptions.RetryError: Deadline of 600.0s exceeded while calling functools.partial(<function _wrap_unary_errors.<locals>.error_remapped_callable at 0x7f3e622a64c0>, parent: "projects/precise-truck-742"
E read_session {
E data_format: ARROW
E table: "projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20211130191234_bc3553/tables/round_trip_73611"
E read_options {
E arrow_serialization_options {
E buffer_compression: LZ4_FRAME
E }
E }
E }
E , metadata=[('x-goog-request-params', 'read_session.table=projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20211130191234_bc3553/tables/round_trip_73611'), ('x-goog-api-client', 'gl-python/3.8.12 grpc/1.42.0 gax/2.2.2 gapic/2.10.1')]), last exception: 503 502:Bad Gateway

.nox/prerelease/lib/python3.8/site-packages/google/api_core/retry.py:205: RetryError
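
The comment in the quoted test explains why the empty-string case runs only with `load_parquet`: in CSV, an empty field and a missing value look the same. The sketch below illustrates that ambiguity using only Python's standard `csv` module. It is a stand-in for illustration and makes no claim about how pandas-gbq serializes data or how BigQuery's loader parses it.

```python
import csv
import io

# Two columns, mirroring the two-column DataFrame built in the test.
rows = [
    ["abc", "abc"],
    ["", ""],      # empty strings
    [None, None],  # missing values
]

# Encode the rows as CSV text.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
encoded = buffer.getvalue()

# Decode the same text again.
decoded = list(csv.reader(io.StringIO(encoded)))

for original, line, parsed in zip(rows, encoded.splitlines(), decoded):
    print(f"{original!r:>16} -> {line!r:>10} -> {parsed!r}")
```

After the round trip, the empty-string row and the `None` row are indistinguishable, so a format with an explicit null representation (such as Parquet) is needed to keep them apart.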

@flaky-bot flaky-bot bot added flakybot: issue An issue filed by the Flaky Bot. Should not be added manually. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Nov 30, 2021
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-pandas API. label Nov 30, 2021
Collaborator

tswast commented Nov 30, 2021

This is a failure in the prerelease session, which must pass to support google-cloud-bigquery 3.x (#426).

Marking as a feature request, since it is not currently an issue.

@tswast tswast added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. labels Nov 30, 2021
Author

flaky-bot bot commented Dec 1, 2021

Looks like this issue is flaky. 😟

I'm going to leave this open and stop commenting.

A human should fix and close this.


When run at the same commit (928e47b), this test passed in one build (Build Status, Sponge) and failed in another build (Build Status, Sponge).
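
The rule the bot describes (same commit, one passing build and one failing build) can be sketched as follows. This is a hypothetical illustration, not Flaky Bot's actual implementation; the function name `classify_tests` and the tuple layout are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (commit, test name, passed?) for one build result. Hypothetical layout.
BuildResult = Tuple[str, str, bool]


def classify_tests(results: Iterable[BuildResult]) -> Dict[Tuple[str, str], str]:
    """Label each (commit, test) pair as passing, failing, or flaky."""
    outcomes: Dict[Tuple[str, str], Set[bool]] = defaultdict(set)
    for commit, test_name, passed in results:
        outcomes[(commit, test_name)].add(passed)

    labels = {}
    for key, seen in outcomes.items():
        if seen == {True, False}:
            # Mixed results at an identical commit: the code did not change,
            # so the difference must come from the environment.
            labels[key] = "flaky"
        elif seen == {True}:
            labels[key] = "passing"
        else:
            labels[key] = "failing"
    return labels


builds = [
    ("928e47b", "test_series_round_trip", True),
    ("928e47b", "test_series_round_trip", False),
]
labels = classify_tests(builds)
print(labels)
```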

@flaky-bot flaky-bot bot added flakybot: flaky Tells the Flaky Bot not to close or comment on this issue. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. labels Dec 1, 2021
@tswast tswast self-assigned this Dec 7, 2021
@tswast tswast added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. labels Dec 7, 2021
Collaborator

tswast commented Dec 7, 2021

Upon further investigation, this appears to be a genuine flake.

.nox/prerelease/lib/python3.8/site-packages/grpc/_channel.py:946:

state = <grpc._channel._RPCState object at 0x7f3e3b785ca0>
call = <grpc._cython.cygrpc.SegregatedCall object at 0x7f3e3b7a1c40>
with_call = False, deadline = None

def _end_unary_response_blocking(state, call, with_call, deadline):
    if state.code is grpc.StatusCode.OK:
        if with_call:
            rendezvous = _MultiThreadedRendezvous(state, call, None, deadline)
            return state.response, rendezvous
        else:
            return state.response
    else:
        raise _InactiveRpcError(state)

E           grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
E               status = StatusCode.UNAVAILABLE
E               details = "502:Bad Gateway"
E               debug_error_string = "{"created":"@1638300163.180055362","description":"Error received from peer ipv4:74.125.197.95:443","file":"src/core/lib/surface/call.cc","file_line":1063,"grpc_message":"502:Bad Gateway","grpc_status":14}"
E           >

.nox/prerelease/lib/python3.8/site-packages/grpc/_channel.py:849: _InactiveRpcError

The above exception was the direct cause of the following exception:

target = functools.partial(<function _wrap_unary_errors.<locals>.error_remapped_callable at 0x7f3e622a64c0>, parent: "projects/...191234_bc3553/tables/round_trip_73611'), ('x-goog-api-client', 'gl-python/3.8.12 grpc/1.42.0 gax/2.2.2 gapic/2.10.1')])
predicate = <function if_exception_type.<locals>.if_exception_type_predicate at 0x7f3e622a63a0>
sleep_generator = <generator object exponential_sleep_generator at 0x7f3e4aa4a040>
deadline = 600.0, on_error = None
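
To read the quoted `debug_error_string` more easily, the payload can be parsed as JSON. The sketch below uses only the standard library. The small status table is a partial, hand-written mapping added for illustration (the traceback itself only confirms that `UNAVAILABLE` surfaces as a 503 `ServiceUnavailable`). It also assumes, without confirmation from the thread, that the numeric suffix in the dataset name `..._20211130191234_bc3553` is a UTC creation timestamp.

```python
import json
from datetime import datetime, timezone

# Copied from the traceback, with the outer quotes dropped so it is valid JSON.
DEBUG_ERROR_STRING = (
    '{"created":"@1638300163.180055362",'
    '"description":"Error received from peer ipv4:74.125.197.95:443",'
    '"file":"src/core/lib/surface/call.cc","file_line":1063,'
    '"grpc_message":"502:Bad Gateway","grpc_status":14}'
)

# Partial, illustrative table: gRPC status number -> (name, usual HTTP status).
GRPC_TO_HTTP = {
    4: ("DEADLINE_EXCEEDED", 504),
    13: ("INTERNAL", 500),
    14: ("UNAVAILABLE", 503),
}


def summarize(debug_error_string: str) -> dict:
    """Extract the status, message, and timestamp from a gRPC debug string."""
    info = json.loads(debug_error_string)
    name, http_status = GRPC_TO_HTTP[info["grpc_status"]]
    # "created" looks like "@<epoch seconds>.<fraction>"; keep whole seconds.
    epoch_seconds = int(info["created"].lstrip("@").split(".")[0])
    return {
        "grpc_status": name,
        "http_status": http_status,
        "message": info["grpc_message"],
        "created": datetime.fromtimestamp(epoch_seconds, tz=timezone.utc),
    }


summary = summarize(DEBUG_ERROR_STRING)

# ASSUMPTION: the dataset name suffix encodes a UTC timestamp.
dataset_created = datetime.strptime("20211130191234", "%Y%m%d%H%M%S").replace(
    tzinfo=timezone.utc
)
elapsed_seconds = int((summary["created"] - dataset_created).total_seconds())

print(summary["grpc_status"], summary["http_status"], summary["message"])
print(summary["created"].isoformat())
print(f"seconds since dataset timestamp: {elapsed_seconds}")
```

Under that assumption, the last error was recorded a little over ten minutes after the dataset timestamp, which is in line with the 600-second retry deadline shown in the traceback.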

Collaborator

tswast commented Dec 9, 2021

Looks like the request was retried, but still failed after 10 minutes. Not much else we can do.

E                   google.api_core.exceptions.RetryError: Deadline of 600.0s exceeded while calling functools.partial(<function _wrap_unary_errors.<locals>.error_remapped_callable at 0x7f3e622a64c0>, parent: "projects/precise-truck-742"
E                   read_session {
E                     data_format: ARROW
E                     table: "projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20211130191234_bc3553/tables/round_trip_73611"
E                     read_options {
E                       arrow_serialization_options {
E                         buffer_compression: LZ4_FRAME
E                       }
E                     }
E                   }
E                   , metadata=[('x-goog-request-params', 'read_session.table=projects/precise-truck-742/datasets/python_bigquery_pandas_tests_system_20211130191234_bc3553/tables/round_trip_73611'), ('x-goog-api-client', 'gl-python/3.8.12 grpc/1.42.0 gax/2.2.2 gapic/2.10.1')]), last exception: 503 502:Bad Gateway
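
The behavior described here (retry transient errors with growing delays until a deadline, then give up and report the last error) can be sketched in plain Python. This is a simplified illustration, not the `google.api_core.retry` implementation: the names `call_with_retry`, `TransientError`, `DeadlineExceeded`, and `FakeClock` are hypothetical, the delay values are arbitrary, and the backoff has no jitter so the example stays deterministic.

```python
import time


class TransientError(Exception):
    """Stand-in for a retryable error such as a 503."""


class DeadlineExceeded(Exception):
    """Raised when retries do not succeed before the deadline."""


def exponential_delays(initial=1.0, maximum=60.0, multiplier=2.0):
    """Yield delays that grow geometrically up to a cap (no jitter)."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * multiplier, maximum)


def call_with_retry(target, is_retryable, deadline, delays,
                    clock=time.monotonic, sleep=time.sleep):
    """Call target() until it succeeds, fails permanently, or time runs out."""
    start = clock()
    last_exc = None
    for delay in delays:
        try:
            return target()
        except Exception as exc:
            if not is_retryable(exc):
                raise
            last_exc = exc
        remaining = deadline - (clock() - start)
        if remaining <= 0:
            raise DeadlineExceeded(
                f"Deadline of {deadline:.1f}s exceeded"
            ) from last_exc
        # Shorten the final sleep so the last attempt lands at the deadline.
        sleep(min(delay, remaining))
    raise ValueError("Delay generator stopped yielding values")


class FakeClock:
    """Simulated clock so the example finishes instantly."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


attempts = []


def always_unavailable():
    attempts.append(1)
    raise TransientError("503 502:Bad Gateway")


fake = FakeClock()
try:
    call_with_retry(
        always_unavailable,
        is_retryable=lambda exc: isinstance(exc, TransientError),
        deadline=600.0,
        delays=exponential_delays(),
        clock=fake.time,
        sleep=fake.sleep,
    )
except DeadlineExceeded as exc:
    outcome = f"{exc}, last exception: {exc.__cause__}"
    print(outcome)
    print(f"attempts: {len(attempts)}, simulated seconds: {fake.now}")
```

When the backend keeps returning the same transient error for the whole window, retrying cannot help, which matches the conclusion above that there is not much else to do on the client side.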
