retry parameter ignored in the operation.result(retry=retry) call #458

Closed
medb opened this issue Oct 4, 2022 · 5 comments
Labels: priority: p1, type: bug

Comments


medb commented Oct 4, 2022

Environment details

  • OS type and version: Debian 5.18.16
  • Python version: Python 3.10.7
  • pip version: pip 22.0.2 from /usr/lib/python3/dist-packages/pip (python 3.10)
  • google-api-core version: 2.10.1

Steps to reproduce

  1. Call any GCP API that returns a long-running operation (LRO).
  2. Call operation.result() with a custom retry parameter.
  3. The call returns about 60 seconds after the operation actually finishes, although the custom retry policy (see below) should add no more than 10 seconds of overhead.

Code example

from google.api_core import retry
from google.cloud import dataproc_v1
from google.api_core.client_options import ClientOptions

import time

project = '. . .'
region = 'us-central1'

# Create Dataproc batch
client_options = ClientOptions(api_endpoint="{}-dataproc.googleapis.com:443".format(region))
client = dataproc_v1.BatchControllerClient(client_options=client_options)
batch = dataproc_v1.Batch()
batch.spark_batch.main_class = 'org.apache.spark.examples.SparkPi'
batch.spark_batch.jar_file_uris = [
  'file:///usr/lib/spark/examples/jars/spark-examples.jar',
]
batch.spark_batch.args = ['1']
batch.runtime_config.properties = {
  'spark.executor.instances': '2',
}

parent = f"projects/{project}/locations/{region}"
request = dataproc_v1.CreateBatchRequest(
  parent=parent,  # type: ignore
  batch=batch,  # type: ignore
)

# Make the request
print("Creating batch")
operation = client.create_batch(request=request)  # type: ignore
print("Batch created")

# Wait for the batch to finish; the intent is to poll at most 10 seconds apart, for up to 600 seconds
retry = retry.Retry(initial=10, maximum=10, multiplier=1.0, deadline=600)
start = time.time()
response = operation.result(retry=retry)
print("Batch finished")
print(f"Took {time.time() - start} seconds to finish batch job")
print(response)

After replacing the response = operation.result(retry=retry) call with custom wait/retry logic, operation completion is detected as expected:

while not operation.done(retry=None):
  time.sleep(10)
response = operation.metadata
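
The workaround loop above could be wrapped in a small helper that also enforces an overall timeout. This is only a sketch: wait_for_operation, poll_interval, timeout, sleep, and clock are hypothetical names that are not part of google-api-core, and the helper assumes only that the operation object exposes done(retry=None) as used above.

```python
import time


def wait_for_operation(operation, poll_interval=10.0, timeout=600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll operation.done() at a fixed interval until it finishes.

    Hypothetical helper, not part of google-api-core. Raises TimeoutError
    if the operation is still running after `timeout` seconds.
    """
    deadline = clock() + timeout
    # retry=None mirrors the workaround: each done() call is a single check.
    while not operation.done(retry=None):
        if clock() >= deadline:
            raise TimeoutError(f"operation not done after {timeout} seconds")
        sleep(poll_interval)
```

The sleep and clock parameters exist only so the helper can be exercised without real waiting. After it returns, the caller can read the result or metadata as in the loop above.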

medb commented Oct 4, 2022

I think this issue is caused by the fact that the custom retry passed by the caller is only forwarded via kwargs:

kwargs = {} if retry is DEFAULT_RETRY else {"retry": retry}
retry_(self._done_or_raise)(**kwargs)

But it is not used for the actual retries; the sleep schedule is built from the wrapping retry object's own settings:

sleep_generator = exponential_sleep_generator(
    self._initial, self._maximum, multiplier=self._multiplier
)
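
The suspected mechanism can be illustrated with a self-contained toy model that does not use google-api-core itself. ToyRetry, delays, and make_done_check are hypothetical stand-ins, jitter is omitted, and the default values (initial=1, maximum=60, multiplier=2) are an assumption about the library defaults. The wrapping object sleeps on its own schedule, while the caller's retry is only forwarded to the wrapped function as a keyword argument.

```python
import itertools


def delays(initial, maximum, multiplier):
    """Simplified, jitter-free exponential backoff schedule (illustrative)."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay *= multiplier


class ToyRetry:
    """Hypothetical stand-in for a retry decorator; not the real Retry class."""

    def __init__(self, initial=1.0, maximum=60.0, multiplier=2.0):
        self.initial, self.maximum, self.multiplier = initial, maximum, multiplier

    def __call__(self, func):
        def wrapped(**kwargs):
            slept = []
            # The schedule comes from *this* object's settings only.
            for delay in delays(self.initial, self.maximum, self.multiplier):
                if func(**kwargs):
                    return slept
                slept.append(delay)  # a real implementation would sleep here

        return wrapped


def make_done_check(polls_needed):
    """Return a fake done-check that succeeds on the given poll number."""
    counter = itertools.count(1)

    def done_or_raise(retry=None):
        # `retry` arrives via kwargs but never influences the sleep schedule.
        return next(counter) >= polls_needed

    return done_or_raise


caller_retry = ToyRetry(initial=10, maximum=10, multiplier=1.0)
default_retry = ToyRetry()  # plays the role of the polling default
slept = default_retry(make_done_check(9))(retry=caller_retry)
print(slept)  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

In this model the delays grow towards 60 seconds regardless of the retry argument, which would be consistent with the roughly 60 second overhead reported above.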

@parthea parthea added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. labels Oct 4, 2022

parthea commented Oct 4, 2022

Googlers see internal issue b/248606796

@vam-google vam-google self-assigned this Oct 4, 2022
vam-google commented

This should fix it: #462.


dataders commented Nov 1, 2022

> This should fix it: #462.

@ChenyuLInx have you had a chance to try this fix out yet?


parthea commented Dec 9, 2022
