Skip to content

Credential expired during retry #3408

Open
@f400810-freddiemac

Description

@f400810-freddiemac

Describe the bug

In RetryableStage execute method, the "AwsCredentails" does not attempt to renew if it has expired. Therefore, if a method called with the existing credential is expiring soon, the number of retry is less than intended due to the expiration of the credential.

Expected Behavior

For retry with EqualJitterBackoffStrategy, expect an expired credential will be renew during retry.

Current Behavior

If a request (in our case S3Client.getObject) failed with s retryable Exception and the credential expired between two retry, we got a S3Exception before the retry limit reached.

software.amazon.awssdk.services.s3.model.S3Exception: The provided token has expired. (Service: S3, Status Code: 400, Request ID: 3YWKVBNJPNTXPJX2, Extended Request ID: GkR56xA0r/Ek7zqQdB2ZdP3wqMMhf49HH7hc5N2TAIu47J3HEk6yvSgVNbX7ADuHDy/Irhr2rPQ=)
        at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleErrorResponse(CombinedResponseHandler.java:123)
        at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleResponse(CombinedResponseHandler.java:79)
        at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:59)
        at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:40)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:40)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:30)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:73)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:50)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:36)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:64)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:34)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
        at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:48)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:31)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
        at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:193)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:135)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:161)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$0(BaseSyncClientHandler.java:84)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:169)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:62)
        at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:52)
        at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:62)
        at software.amazon.awssdk.services.s3.DefaultS3Client.getObject(DefaultS3Client.java:4371)
        at com.freddiemac.fe.distributed.computing.grid.aws.S3Bucket.getObject(S3Bucket.java:131)
        at com.freddiemac.fe.distributed.computing.grid.aws.S3Bucket.getTaskOutput(S3Bucket.java:112)
        at com.freddiemac.fe.distributed.computing.grid.aws.Job$Ready.readTask(Job.java:70)
        at com.freddiemac.fe.distributed.computing.grid.aws.Job.readTask(Job.java:257)
        at com.freddiemac.fe.distributed.computing.grid.aws.AwsProcessor.lambda$completeResponse$3(AwsProcessor.java:135)
        at com.freddiemac.fe.distributed.computing.grid.api.CompletionHandler.handle(CompletionHandler.java:223)
        at com.freddiemac.fe.distributed.computing.grid.api.CompletionHandler.lambda$newTaskHandler$18(CompletionHandler.java:213)
        at com.freddiemac.fe.distributed.computing.grid.api.GridClient.lambda$convertToBiFunction$2(GridClient.java:169)
        at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)
        at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)
        at java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:443)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)

Reproduction Steps

To consistently reproduce the issue, we create our own implementation of ResponseTransformer that always throw RetryableException. And setup retry policy that will retry pass the credential expired (In our case, the credential has an hour of life. So we setup retry policy go over an hour). Then we call S3Client.getObject with our ResponseTransformer implementation. In stead of failed with reaching the retry limit, we got the S3Exception with the provided token has expired.

Possible Solution

For every retry, the request may call AwsCredentialsProvider resolveCredentails to ensure the freshness of the credential

Additional Information/Context

No response

AWS Java SDK version used

2.16.104

JDK version used

1.8.0_181

Operating System and version

Redhat 7.9

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugThis issue is a bug.p2This is a standard priority issue

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions