Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(appmesh): add route retry policies #13353

Merged
merged 15 commits into from
Mar 11, 2021
Merged

Conversation

misterjoshua
Copy link
Contributor

@misterjoshua misterjoshua commented Mar 2, 2021

Adds route retry policies for http/http2 and gRPC routes.

Closes #11642


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@gitpod-io
Copy link

gitpod-io bot commented Mar 2, 2021

@misterjoshua misterjoshua changed the title feat(ecs): add route retry policies feat(appmesh): add route retry policies Mar 2, 2021
@github-actions github-actions bot added the @aws-cdk/aws-appmesh Related to AWS App Mesh label Mar 2, 2021
@misterjoshua
Copy link
Contributor Author

misterjoshua commented Mar 2, 2021

Apologies on the title of the commit 6c35500 - it should say feat(appmesh): add route retry policies, but I'm hesitant to force push now that I've already created the PR.

Copy link
Contributor

@skinny85 skinny85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks great @misterjoshua ! I have some strong opinions on the empty array validation, but not much comments beyond that.

packages/@aws-cdk/aws-appmesh/README.md Show resolved Hide resolved
Comment on lines 347 to 349
grpcRetryEvents: [appmesh.GrpcRetryEvent.DEADLINE_EXCEEDED],
httpRetryEvents: [appmesh.HttpRetryEvent.CLIENT_ERROR],
tcpRetryEvents: [appmesh.TcpRetryEvent.CONNECTION_ERROR],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here - can a GRPC Route retry on HTTP and TCP events?

Copy link
Contributor Author

@misterjoshua misterjoshua Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skinny85 Same as above for HTTP routes. gRPC is layered on HTTP/2 which is layered on TCP, so these routes can specify retry events for all the lower layers.

/**
* The timeout for each retry attempt
*/
readonly perRetryTimeout: cdk.Duration;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about just calling this retryTimeout?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like that, but I do have a nit. The CDK design guidelines want us to use the terminology of the original service documentation. The CloudFormation calls it PerRetryTimeout (gRPC, HTTP). What do you think given this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think retryTimeout is clear enough to not cause any confusion here.

/**
* Internal error
*/
INTERNAL = 'internal',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you call this INTERNAL_ERROR then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong objection here.

INTERNAL = 'internal',

/**
* Resource was exhausted
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean? I feel like more explanation is needed here.

Copy link
Contributor Author

@misterjoshua misterjoshua Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the error aligns with gRPC code 8/RESOURCE_EXHAUSTED, which per gRPC documentation means, "Some resource has been exhausted, perhaps a per-user quota, or perhaps the entire file system is out of space." I'm not sure if I'm allowed to copy-paste that verbatim from the docs I just referenced - if I'm not, maybe we can say, "A resource has been exhausted." Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding that link to the gRPC documentation to the JSDocs of that field?

Copy link
Contributor Author

@misterjoshua misterjoshua Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok no problem. I added @see links to each of the enum values.

* permanently unavailable.
* @default - no retries for tcp events
*/
readonly tcpRetryEvents?: TcpRetryEvent[];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that either this, or httpRetryEvents is required, needs to be stated in the documentation of this property.

Comment on lines 335 to 341
if (props.retryPolicy.httpRetryEvents && props.retryPolicy.httpRetryEvents.length === 0) {
throw new Error('Specify at least one value in `httpRetryEvents` or leave it undefined');
}

if (props.retryPolicy.tcpRetryEvents && props.retryPolicy.tcpRetryEvents.length === 0) {
throw new Error('Specify at least one value in `tcpRetryEvents` or leave it undefined');
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this validation, and simply return undefined if a customer provided an empty list, in renderHttpRetryPolicy.

No reason to burden the customer with unnecessary errors if we know what to do.

Copy link
Contributor Author

@misterjoshua misterjoshua Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problemo.

Comment on lines 422 to 432
if (props.retryPolicy.grpcRetryEvents && props.retryPolicy.grpcRetryEvents.length === 0) {
throw new Error('Specify at least one value in `grpcRetryEvents` or leave it undefined');
}

if (props.retryPolicy.httpRetryEvents && props.retryPolicy.httpRetryEvents.length === 0) {
throw new Error('Specify at least one value in `httpRetryEvents` or leave it undefined');
}

if (props.retryPolicy.tcpRetryEvents && props.retryPolicy.tcpRetryEvents.length === 0) {
throw new Error('Specify at least one value in `tcpRetryEvents` or leave it undefined');
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here (get rid of this validation, and return undefined for an empty list in renderGrpcRetryPolicy ).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. No problem.

* gRPC events on which to retry
* @default - no retries for gRPC events
*/
readonly grpcRetryEvents?: GrpcRetryEvent[];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that one of this, tcpRetryEvents or httpRetryEvents is required needs to be stated in the documentation of this property.

Comment on lines 375 to 436
test.throws(() => {
appmesh.RouteSpec.http({
weightedTargets: [{ virtualNode }],
retryPolicy: {
httpRetryEvents: [],
tcpRetryEvents: [appmesh.TcpRetryEvent.CONNECTION_ERROR],
maxRetries: 5,
perRetryTimeout: cdk.Duration.seconds(10),
},
});
}, /at least one.*httpRetryEvents/);

test.throws(() => {
appmesh.RouteSpec.http({
weightedTargets: [{ virtualNode }],
retryPolicy: {
httpRetryEvents: [appmesh.HttpRetryEvent.CLIENT_ERROR],
tcpRetryEvents: [],
maxRetries: 5,
perRetryTimeout: cdk.Duration.seconds(10),
},
});
}, /at least one.*tcpRetryEvents/);

test.throws(() => {
appmesh.RouteSpec.grpc({
weightedTargets: [{ virtualNode }],
match: { serviceName: 'servicename' },
retryPolicy: {
grpcRetryEvents: [],
tcpRetryEvents: [appmesh.TcpRetryEvent.CONNECTION_ERROR],
maxRetries: 5,
perRetryTimeout: cdk.Duration.seconds(10),
},
});
}, /at least one.*grpcRetryEvents/i);

test.throws(() => {
appmesh.RouteSpec.grpc({
weightedTargets: [{ virtualNode }],
match: { serviceName: 'servicename' },
retryPolicy: {
httpRetryEvents: [],
tcpRetryEvents: [appmesh.TcpRetryEvent.CONNECTION_ERROR],
maxRetries: 5,
perRetryTimeout: cdk.Duration.seconds(10),
},
});
}, /at least one.*httpRetryEvents/i);

test.throws(() => {
appmesh.RouteSpec.grpc({
weightedTargets: [{ virtualNode }],
match: { serviceName: 'servicename' },
retryPolicy: {
httpRetryEvents: [appmesh.HttpRetryEvent.CLIENT_ERROR],
tcpRetryEvents: [],
maxRetries: 5,
perRetryTimeout: cdk.Duration.seconds(10),
},
});
}, /at least one.*tcpRetryEvents/i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I commented on the production code, I think this is not good behavior.

I would remove all of these tests in favor of checking that providing an empty list results in the appropriate value not being written to the template. You can use the ABSENT constant from the assertion module to assert that the given value is not used - see example usage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. I didn't know this constant was a thing. Alright. This shouldn't be a problem.

@misterjoshua
Copy link
Contributor Author

misterjoshua commented Mar 4, 2021

@skinny85 I saw your review comments. I'll look at working on this tomorrow when I'm at work. I think I have enough to go on to make some progress, but I'm hoping for some feedback on a few of the review comments. Thanks.

@mergify mergify bot dismissed skinny85’s stale review March 4, 2021 16:25

Pull request has been modified.

Copy link
Contributor

@alexbrjo alexbrjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you've addressed Adam's comments. Thanks for the PR, this LGTM!

const tcpRetryEvents = props.retryPolicy.tcpRetryEvents ?? [];

if (grpcRetryEvents.length + httpRetryEvents.length + tcpRetryEvents.length === 0) {
throw new Error('You must specify one value for at least one of `grpcRetryEvents`, `httpRetryEvents`, `tcpRetryEvents`');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: You must specify one value for at least one of 'grpcRetryEvents', 'httpRetryEvents' or 'tcpRetryEvents'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I've pushed up a fix for this.

Copy link
Contributor

@skinny85 skinny85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great @misterjoshua , thanks so much for the contribution!

@mergify
Copy link
Contributor

mergify bot commented Mar 11, 2021

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@misterjoshua
Copy link
Contributor Author

This is great @misterjoshua , thanks so much for the contribution!

My pleasure!

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject89A8053A-LhjRyN9kxr8o
  • Commit ID: e683201
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

This was referenced Mar 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-appmesh Related to AWS App Mesh
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[appmesh] implement route retry policies
4 participants