Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(node): add exponential retry mechanism with idempotency headers #4462

Merged
merged 23 commits into from
Dec 1, 2023
Merged

feat(node): add exponential retry mechanism with idempotency headers #4462

merged 23 commits into from
Dec 1, 2023

Conversation

michaldziuba03
Copy link
Contributor

@michaldziuba03 michaldziuba03 commented Oct 9, 2023

What change does this PR introduce?

This PR introduces exponential retry mechanism with idempotency headers for Node SDK.

PR contains:

  • adding idempotency key header for each POST and PATCH request
  • configurable exponential retry mechanism
  • tests

Why was this change needed?

Closes #4452

Other information (Screenshots)

@michaldziuba03
Copy link
Contributor Author

michaldziuba03 commented Oct 10, 2023

@Cliftonz I'm wondering about the retry condition. I think it should be clearly defined in requirements.

What I plan to do:

for idempotent http methods SDK will retry on 50x errors + network errors without response. I will not apply Idempotency-Key header for these idempotent methods.

for non-idempotent http methods SDK will retry only on network errors without response / maybe on 409 Conflict. I think most 50x errors will be replayed from the cache anyway, so request errored with 50x status shouldn't be retried.

@dammy001
Copy link

@Cliftonz I'm wondering about the retry condition. I think it should be clearly defined in requirements.

What I plan to do:

for idempotent http methods SDK will retry on 50x errors + network errors without response. I will not apply Idempotency-Key header for these idempotent methods.

for non-idempotent http methods SDK will retry only on network errors without response / maybe on 409 Conflict. I think most 50x errors will be replayed from the cache anyway, so request errored with 50x status shouldn't be retried.

And also, there should be an option for setting retryCondition callback should from userland.

@Cliftonz
Copy link
Contributor

@Cliftonz I'm wondering about the retry condition. I think it should be clearly defined in requirements.

What I plan to do:

for idempotent http methods SDK will retry on 50x errors + network errors without response. I will not apply Idempotency-Key header for these idempotent methods.

for non-idempotent http methods SDK will retry only on network errors without response / maybe on 409 Conflict. I think most 50x errors will be replayed from the cache anyway, so request errored with 50x status shouldn't be retried.

Just to make sure we are on the same page, these should be configurable by the client.
By default we want RetryMax, WaitMin, WaitMax , and InitialDelay to be resonable values like
RetryMax: 0
Waitmin: 1
Waitmax: 2
InitialDelay: 0.5

As for applying the the idempotency header, you should only do it to POST and PATCH methods however we can add it to all calls its just not preferred.

As for which errors that should be retried, not all errors are considered equal and here are the Errors that should be considered for retrying:
- 408 Request Timeout
- 429 Too Many Requests
- 500 Internal Server Error
- 502 Bad Gateway
- 503 Service Unavailable
- 504 Gateway Timeout

Errors that generally should not be retried without addressing the cause:
- 400 Bad Request
- 401 Unauthorized
- 403 Forbidden
- 404 Not Found
- 405 Method Not Allowed
- 413 Payload Too Large
- 414 URI Too Long
- 415 Unsupported Media Type
- 422 Unprocessable Entity

@Cliftonz
Copy link
Contributor

@michaldziuba03 I also agree with @dammy001 we should provide a callback for the user to dynamically select the condition if they so want.

Comment on lines 12 to 30
axios.interceptors.request.use((axiosConfig) => {
// don't attach idempotency key for idempotent methods
if (
axiosConfig.method &&
!NON_IDEMPOTENT_METHODS.includes(axiosConfig.method)
) {
return axiosConfig;
}

const idempotencyKey = axiosConfig.headers['Idempotency-Key'];
// that means intercepted request is retried, so don't generate new idempotency key
if (idempotencyKey) {
return axiosConfig;
}

axiosConfig.headers['Idempotency-Key'] = uuid();

return axiosConfig;
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using axios interceptor to attach idempotency key header to each request.

Idempotency key is attached only to POST and PATCH methods because they are not idempotent by definition.

Comment on lines 64 to 67
// retry on TCP/IP error codes like ECONNRESET
if (isNetworkError(err)) {
return true;
}
Copy link
Contributor Author

@michaldziuba03 michaldziuba03 Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isNetworkError is imported from axios-retry library.

As we know HTTP is based on TCP/IP protocol and sometimes we facing issues on this network level.

TCP/IP error typically contains some code like "ECONNRESET" and isNetworkError uses under the hood library is-retry-allowed, because not every network error should be retried.

Comment on lines 77 to 83
if (err.response.status >= 500 && err.response.status <= 599) {
return true;
}

if (RETRIABLE_HTTP_CODES.includes(err.response.status)) {
return true;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On HTTP level, I decided to retry on every 5xx error, 408 (Request Timeout) and 429 (Too Many Requests)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also retry on 422 but regen the idempotency key

@@ -77,7 +81,7 @@
"preset": "ts-jest",
"testEnvironment": "node",
"moduleNameMapper": {
"axios": "axios/dist/node/axios.cjs"
"^axios$": "axios/dist/node/axios.cjs"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without that change, tests for retries don't work.

I observed strange behavior - it sends GET request to 127.0.0.1:80 even when you send POST request to baseUrl, it messes up the request config.

change to ^axios$ fixed that issue. This is also recommendation inside nock readme for axios.

I also tested manually using symlink in testing repo's node_modules and retries work as expected.

@Cliftonz
Copy link
Contributor

@michaldziuba03 Great overview of the PR!

Comment on lines 58 to 68
onRetry(_retryCount, error, requestConfig) {
if (
error.response?.status === 422 &&
requestConfig.headers &&
requestConfig.method &&
NON_IDEMPOTENT_METHODS.includes(requestConfig.method)
) {
requestConfig.headers['Idempotency-Key'] = uuid();
}
},
});
Copy link
Contributor Author

@michaldziuba03 michaldziuba03 Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cliftonz regenerating idempotency key for HTTP status 422 Unprocessable Content

@michaldziuba03 michaldziuba03 marked this pull request as ready for review October 13, 2023 01:55
@michaldziuba03
Copy link
Contributor Author

@Cliftonz @scopsy PR is ready for judgment :)

packages/node/src/lib/novu.ts Outdated Show resolved Hide resolved
packages/node/src/lib/novu.ts Outdated Show resolved Hide resolved
packages/node/src/lib/retries.spec.ts Outdated Show resolved Hide resolved
const result = await novuClient.topics.list({});
expect(result.status).toEqual(200);
});
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice test 🙌

expect(result.request.headers['idempotency-key']).toBeUndefined();
});

it('should not retry on common 4xx errors', async () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: common isn't quite the correct term here, non-recoverable would be more descriptive.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Secondly, this is a good case for using a jest loop to test all relevant 4xx errors. It will make the tests stronger and more declarative.


expect(result.status).toEqual(200);
expect(result.request.headers['idempotency-key']).toBeUndefined();
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is great. Could we add another test using the same format with times(4) and verify that a 500 is received?

packages/node/src/lib/retries.ts Outdated Show resolved Hide resolved
packages/node/src/lib/retries.ts Outdated Show resolved Hide resolved
packages/node/src/lib/retries.ts Outdated Show resolved Hide resolved

if (!err.response) {
return true;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear why we're retrying when a response is unavailable. Could you please shed some light on this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's mistake (should return false) - if statements after this check are dependent on err.response (possibly undefined). In new changes I made it look cleaner.

Comment on lines 60 to 82
it('should generate different idempotency-key for each request', async () => {
nock(BACKEND_URL)
.post(TRIGGER_PATH)
.reply(500, { message: 'Server Exception' });

nock(BACKEND_URL)
.post(TRIGGER_PATH)
.times(10)
.reply(201, { acknowledged: true, transactionId: '1003' });

const idempotencyKeys: string[] = [];

for (let i = 0; i < 10; i++) {
const result = await novu.trigger('fake-workflow', {
to: { subscriberId: '123' },
payload: {},
});

idempotencyKeys.push(result.request?.headers['idempotency-key']);
}

expect(allEqual(idempotencyKeys)).toEqual(false);
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new test case - I want to make sure that idempotency key is different between requests

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch 🎉

Comment on lines 5 to 12

const NON_IDEMPOTENT_METHODS = ['post', 'patch'];
const IDEMPOTENCY_KEY = 'Idempotency-Key'; // header key

const DEFAULT_RETRY_MAX = 0;
const DEFAULT_WAIT_MIN = 1;
const DEFAULT_WAIT_MAX = 30;

Copy link
Contributor Author

@michaldziuba03 michaldziuba03 Nov 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new global variables - for default values as suggested in review + I decided to move string value for header key to IDEMPOTENCY_KEY variable (to avoid "magic strings" as it's used in multiple places)

@michaldziuba03 michaldziuba03 requested a review from rifont November 3, 2023 03:13
@michaldziuba03
Copy link
Contributor Author

@rifont


jest.setTimeout(15000);

const allEqual = (arr: Array<string>) => arr.every((val) => val === arr[0]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const allEqual = (arr: Array<string>) => arr.every((val) => val === arr[0]);
const hasUniqueOnly = (arr: Array<string>) => Array.from(new Set(arr)).length === arr.length);

Consider the case where every array item except the first is the same, like [1,2,2,2,2]. This function will currently return true, because it's only comparing entries against the first item, not all of the entries.

So we'll need to instead verify that the number of unique entries in the array is equal to the number of total entries. new Set(Array<any>) constructs a Set comprising unique only values. We can then transform the Set to an Array for a length comparison. See https://stackoverflow.com/a/14438954 for reference.

We should also use the convention of is/has (is for singular test against a primitive, has for plural test, like an array) prefix for a boolean function, to provide a better description for the result.

Lastly, it's generally a good idea to also add tests for helper functions. If these helper functions aren't working as expected, how can we be sure that the feature under test is too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with second sentence but I understand where problem is. Function allEqual (currently renamed to hasAllEqual) works well when our goal is to check if all items are equal (it iterates over all elements and compares them to the first element, it also compares first element with itself). This is the goal when we check in tests if idempotency-keys are equal between retries (in the scope of same request).

I agree this function is used in wrong way when we check if idempotency-keys are unique between requests - we expect allEqual to return false. Negation of this function does not guarantee that all items are unique, so this expectation will pass in case of [1,2,2,2,2] scenario, but shouldn't.

I decided to let them coexist :)

await expect(novu.topics.list({})).rejects.toMatchObject({
response: { status: 500 },
});
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job 👏

Comment on lines 60 to 82
it('should generate different idempotency-key for each request', async () => {
nock(BACKEND_URL)
.post(TRIGGER_PATH)
.reply(500, { message: 'Server Exception' });

nock(BACKEND_URL)
.post(TRIGGER_PATH)
.times(10)
.reply(201, { acknowledged: true, transactionId: '1003' });

const idempotencyKeys: string[] = [];

for (let i = 0; i < 10; i++) {
const result = await novu.trigger('fake-workflow', {
to: { subscriberId: '123' },
payload: {},
});

idempotencyKeys.push(result.request?.headers['idempotency-key']);
}

expect(allEqual(idempotencyKeys)).toEqual(false);
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch 🎉

packages/node/src/lib/retry.ts Outdated Show resolved Hide resolved
packages/node/src/lib/retry.ts Show resolved Hide resolved
@michaldziuba03 michaldziuba03 requested a review from rifont November 9, 2023 04:18
Copy link
Collaborator

@rifont rifont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job @michaldziuba03 🙌

Comment on lines +219 to +275
describe('defaultRetryCondition function', () => {
test.each<[number, string]>(NON_RECOVERABLE_ERRORS)(
'should return false when HTTP status is %i',
(status) => {
const err = new HttpError(status);
expect(defaultRetryCondition(err as AxiosError)).toEqual(false);
}
);

test.each<number>(RETRYABLE_HTTP_CODES)(
'should return true when HTTP status is %i',
(status) => {
const err = new HttpError(status);
expect(defaultRetryCondition(err as AxiosError)).toEqual(true);
}
);

it('should return true when HTTP status is 500', () => {
const err = new HttpError(500);
expect(defaultRetryCondition(err as AxiosError)).toEqual(true);
});

it('should return true when network code is ECONNRESET', () => {
const err = new NetworkError('ECONNRESET');
expect(defaultRetryCondition(err as AxiosError)).toEqual(true);
});

it('shoud return false on unknown error', () => {
const err = new Error('Unexpected error');
expect(defaultRetryCondition(err as AxiosError)).toEqual(false);
});
});

describe('hasAllEqual helper function', () => {
it('should return true when all items are equal', () => {
const arr = ['a', 'a', 'a', 'a'];
expect(hasAllEqual(arr)).toEqual(true);
});

it('should return false when items are not equal', () => {
const arr = ['a', 'b', 'b', 'b'];
expect(hasAllEqual(arr)).toEqual(false);
});
});

describe('hasUniqueOnly helper function', () => {
it('should return true when all items are unique', () => {
const arr = ['a', 'b', 'c', 'd'];
expect(hasUniqueOnly(arr)).toEqual(true);
});

it('should return false when items are not unique', () => {
const arr = ['a', 'a', 'c', 'd'];
expect(hasUniqueOnly(arr)).toEqual(false);
});
});
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent tests 👍

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @michaldziuba03 , just a few spellcheck issues to resolve then we are good to merge 🙌

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rifont done

@michaldziuba03
Copy link
Contributor Author

@rifont we are waiting for something before merge? Spellcheck issues were resolved week ago.

Let me know if there is something else to do.

@rifont
Copy link
Collaborator

rifont commented Dec 1, 2023

@rifont we are waiting for something before merge? Spellcheck issues were resolved week ago.

Let me know if there is something else to do.

Hi @michaldziuba03 , we had some failing tests on next that we were seeking to fix. However we will now fix these separately. I will get this merged today. Thanks!

@rifont rifont merged commit 4ea758c into novuhq:next Dec 1, 2023
24 of 26 checks passed
@michaldziuba03 michaldziuba03 deleted the nv-3009-add-exponential-retry branch December 1, 2023 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[NV-3009] 🚀 Feature: Typescript SDK - Add Exponential Retry Mechanism with Idempotency Headers
4 participants