-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Unary Callables Deadline values respect the TotalTimeout in RetrySettings #1603
Conversation
gax-java/gax-httpjson/src/main/java/com/google/api/gax/httpjson/HttpRequestRunnable.java
Outdated
Show resolved
Hide resolved
gax-java/gax-httpjson/src/main/java/com/google/api/gax/httpjson/HttpRequestRunnable.java
Outdated
Show resolved
Hide resolved
@@ -88,6 +93,7 @@ | |||
private final ApiMethodDescriptor<RequestT, ResponseT> methodDescriptor; | |||
private final HttpTransport httpTransport; | |||
private final Executor executor; | |||
private final ScheduledExecutorService deadlineCancellationExecutor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we try to reuse the executor
above? I know it's not a ScheduledExecutorService
at this point, but the what's being passed in is from here, which is indeed a ScheduledExecutorService
, so we should be able to change the type and reuse it. I did a quick search and I don't think most of the change would be breaking, there is one public builder method that could be considered breaking.
The reason we may want to reuse the existing executor is that we can leverage the existing client lifecycles for shutting down the executor, otherwise we need to explicitly shut down this new deadlineCancellationExecutor
somewhere to prevent from resource leaking, which is missing from this PR currently. Reusing the existing executor may also have other issues like the ThreadPool is not large enough so new requests may have to wait for new thread. The bottom line is that we need to be careful with anything related to executors, so we have enough resource for large number of concurrent requests and we don't leak resource accidentally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, makes sense. I'll take another look into how grpc handles the multiple executors. I'd imagine they would also possibly run into this issue.
// there is a timeout exception from this RPC call (DEADLINE_EXCEEDED) | ||
private void closeAndNotifyListeners() { | ||
synchronized (lock) { | ||
close( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment in close
method makes me worried that we may still have to wait for the server to fully send the response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the cancellation in the runnable doesn't seem like it's actually cancel's the runnable. I'll look into the possibility of being able to actually disconnect the connection instead of waiting for the socket timeout.
// will occur immediately). For any other value, the deadlineScheduler will | ||
// terminate in the future (even if the timeout is small). | ||
@InternalApi | ||
protected boolean shouldRPCTerminate(long timeLeftMs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need this new method now that we are using ms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is here due the the very subtle difference between unary/streaming vs LRO retry logic. Polling from LRO's should retry even if timeLeftMs == 0 (as it would be the final polling attempt), but unary/streaming callables should not (and it would signify an rpc timeout attempt of 0)
@Test | ||
public void testGRPC_LROSuccessfulResponse_NoDeadlineExceeded() | ||
throws IOException, ExecutionException, InterruptedException { | ||
EchoStubSettings.Builder grpcEchoSettingsBuilder = EchoStubSettings.newBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These setups are 40 lines long and made the tests hard to read, can we extract the common set ups to a private method or setup method or Util class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yep, will do.
public class ITLongRunningOperation { | ||
|
||
@Test | ||
public void testGRPC_LROSuccessfulResponse_NoDeadlineExceeded() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the test name still reflect what we are testing? Or maybe we need to make it clearer?
I see NoDeadlineExceeded
which assumes not retry, but I see an assertion in the end that asserts attemptCount
to be at least 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yep, I can update the test names to better clarity. Some tests cases are testing that the deadline exceeded and that it throws a DEADLINE_EXCEEDED exception and it's very easy to confuse.
// Explicitly set retries as disabled (maxAttempts == 1) | ||
.setMaxAttempts(1) | ||
.build(); | ||
EchoStubSettings.Builder grpcEchoSettingsBuilder = EchoStubSettings.newBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing for the setups in this class, it would be great if we can simplify them. Maybe we can expose a method in TestClientInitializer that takes RetrySettings
and RetryableCodes
as parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep will do! That's a good idea.
showcase/gapic-showcase/src/test/java/com/google/showcase/v1beta1/it/ITUnaryDeadline.java
Show resolved
Hide resolved
* <p>Each test attempts to get the number of attempts done in each call. The attemptCount is | ||
* incremented by 1 as the first attempt is zero indexed. | ||
*/ | ||
public class ITUnaryDeadline { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not part of this PR. It would be great if we can programmatically start/stop the showcase server so we can test things like connect/socket timeouts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can raise in the showcase repo and see what the others think (or see if that's possible / easily done).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant more like doing it in our own repo, but yeah asking others if it has been done before is a good idea.
if (newDeadline != null) { | ||
builder.setDeadline(newDeadline); | ||
if (inputOptions.getTimeout() != null) { | ||
Duration newTimeout = java.time.Duration.ofNanos(inputOptions.getTimeout().getNano()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use ms instead of nanos now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
[gapic-generator-java-root] Kudos, SonarCloud Quality Gate passed! |
[java_showcase_integration_tests] SonarCloud Quality Gate failed. |
[java_showcase_unit_tests] Kudos, SonarCloud Quality Gate passed! |
We should be able to cancel the runnable/ close the call exactly when the RPC timeout has elapsed.
Fixes: #1504