When client exhausts retry attempts for any API call, it does not make retries for subsequent API calls #4754

Closed
@trivikr

Describe the bug

Once the client exhausts its retry attempts for any API call, it makes no retries for subsequent API calls.

SDK version number

All >=v3.229.0, used v3.335.0 in testing

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

All, used v16.20.0 in testing

Reproduction Steps

The repro below uses a custom request handler that throws a TimeoutError for every call, and a retryStrategy with a delay of just 100ms between retries.

import { Kinesis } from "@aws-sdk/client-kinesis"; // v3.335.0
import { NodeHttp2Handler } from "@aws-sdk/node-http-handler";
import { ConfiguredRetryStrategy } from "@aws-sdk/util-retry";

// Simulate a timeout error.
class NodeHttp2HandlerReturnsTimeout extends NodeHttp2Handler {
  async handle(request, options) {
    const timeoutError = new Error("Request timed out");
    timeoutError.name = "TimeoutError";
    throw timeoutError;
  }
}

const client = new Kinesis({
  region: "us-west-2",
  requestHandler: new NodeHttp2HandlerReturnsTimeout(),
  retryStrategy: new ConfiguredRetryStrategy(
    2, // max attempts.
    () => 100 // wait only 100ms between retries so that client-side rate limiting is reached quickly.
  ),
});

let calls = 0;
let attempts = 0;
while (true) {
  try {
    await client.listStreams({});
  } catch (error) {
    const currentCallAttempts = error.$metadata.attempts;
    
    calls += 1;
    attempts += currentCallAttempts;
    console.log(`Total: ${calls} calls, ${attempts} attempts`);

    if (currentCallAttempts === 1) {
      // No more retries can be made.
      break;
    }
  }
}

To log the attempts, a console.log call was added to shouldRetry in node_modules/@aws-sdk/util-retry/dist-cjs/StandardRetryStrategy.js

// ...
    shouldRetry(tokenToRenew, errorInfo, maxAttempts) {
        const attempts = tokenToRenew.getRetryCount();
        console.log({ attempts, maxAttempts, errorInfo });
        return (attempts < maxAttempts &&
            tokenToRenew.hasRetryTokens(errorInfo.errorType) &&
            this.isRetryableError(errorInfo.errorType));
    }
// ...

Observed Behavior

When the listStreams API call is made a second time, the attempts value left over from the first call (i.e., 2) is reused, so the check attempts < maxAttempts fails immediately and no retries are made.

{ attempts: 0, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 1, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 2, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
Total: 1 calls, 3 attempts
{ attempts: 2, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
Total: 2 calls, 4 attempts
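
The observed counts can be reproduced without the SDK. The sketch below is a simplified model, not the SDK's implementation: ModelRetryToken, SharedTokenStrategy and simulateFailingCall are hypothetical names, and the model assumes the strategy hands the same token object to every call, as the log above suggests. It ignores the retry token bucket and delays entirely.

```javascript
// Hypothetical stand-in for a retry token; it only tracks the retry count.
class ModelRetryToken {
  constructor() {
    this.retryCount = 0;
  }
  getRetryCount() {
    return this.retryCount;
  }
}

// Hypothetical strategy that returns the SAME token for every call.
class SharedTokenStrategy {
  constructor(maxAttempts) {
    this.maxAttempts = maxAttempts;
    this.retryToken = new ModelRetryToken();
  }
  acquireInitialRetryToken() {
    return this.retryToken; // shared across calls, so the count never resets
  }
  refreshRetryTokenForRetry(token) {
    if (token.getRetryCount() >= this.maxAttempts) {
      throw new Error("No retry token available");
    }
    token.retryCount += 1;
    return token;
  }
}

// Simulates one API call whose every attempt fails; returns the attempts made.
function simulateFailingCall(strategy) {
  let token = strategy.acquireInitialRetryToken();
  let attempts = 0;
  while (true) {
    attempts += 1; // the request is sent and fails
    try {
      token = strategy.refreshRetryTokenForRetry(token);
    } catch {
      return attempts; // no further retries are allowed
    }
  }
}

const sharedStrategy = new SharedTokenStrategy(2);
const firstCallAttempts = simulateFailingCall(sharedStrategy);
const secondCallAttempts = simulateFailingCall(sharedStrategy);
console.log({ firstCallAttempts, secondCallAttempts });
// → { firstCallAttempts: 3, secondCallAttempts: 1 }
```

In this model the first call makes 3 attempts and the second makes only 1, which matches the totals of 3 and 4 attempts in the log.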

Expected Behavior

Each API call starts with a fresh attempt count, and calls keep being retried until the retry tokens are actually exhausted.

{ attempts: 0, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 1, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 2, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
Total: 1 calls, 3 attempts
{ attempts: 0, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 1, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 2, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
Total: 2 calls, 6 attempts
{ attempts: 0, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 1, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
{ attempts: 2, maxAttempts: 2, errorInfo: { errorType: 'TRANSIENT' } }
Total: 3 calls, 9 attempts
// ... and so on

Possible Solution

This needs a deep dive, but it looks like StandardRetryStrategy should return a new retry token each time acquireInitialRetryToken is called.

   }
 
   public async acquireInitialRetryToken(retryTokenScope: string): Promise<StandardRetryToken> {
-    return this.retryToken;
+    return getDefaultRetryToken(INITIAL_RETRY_TOKENS, DEFAULT_RETRY_DELAY_BASE);
   }
 
   public async refreshRetryTokenForRetry(
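
The effect of the proposed change can be illustrated with a small model that is independent of the SDK. This is a sketch under stated assumptions rather than the actual fix: FreshRetryToken, FreshTokenStrategy and runFailingCall are hypothetical names, and the model leaves out the retry token bucket, which a real fix would presumably still need to share across calls so that the tokens can eventually run out.

```javascript
// Hypothetical stand-in for a retry token; it only tracks the retry count.
class FreshRetryToken {
  constructor() {
    this.retryCount = 0;
  }
}

// Hypothetical strategy that creates a NEW token for every call.
class FreshTokenStrategy {
  constructor(maxAttempts) {
    this.maxAttempts = maxAttempts;
  }
  acquireInitialRetryToken() {
    return new FreshRetryToken(); // each call starts again at retry count 0
  }
  refreshRetryTokenForRetry(token) {
    if (token.retryCount >= this.maxAttempts) {
      throw new Error("No retry token available");
    }
    token.retryCount += 1;
    return token;
  }
}

// Simulates one API call whose every attempt fails; returns the attempts made.
function runFailingCall(strategy) {
  let token = strategy.acquireInitialRetryToken();
  let attempts = 0;
  while (true) {
    attempts += 1; // the request is sent and fails
    try {
      token = strategy.refreshRetryTokenForRetry(token);
    } catch {
      return attempts; // no further retries are allowed
    }
  }
}

const freshStrategy = new FreshTokenStrategy(2);
const runningTotals = [];
let totalAttempts = 0;
for (let call = 1; call <= 3; call++) {
  totalAttempts += runFailingCall(freshStrategy);
  runningTotals.push(totalAttempts);
  console.log(`Total: ${call} calls, ${totalAttempts} attempts`);
}
// Prints totals of 3, 6 and 9 attempts for calls 1, 2 and 3.
```

With a fresh token per call, every call in the model makes 3 attempts, giving the running totals shown under Expected Behavior.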

Additional Information/Context

This issue was discovered while working on #4753

Metadata

Labels

bug (This issue is a bug.)
p0 (This issue is the highest priority)
queued (This issue is on the AWS team's backlog)
