Change Lambdas to create and update in parallel #3976

Merged
merged 10 commits into from
Jun 7, 2024

flostadler
Contributor

@flostadler flostadler commented May 23, 2024

Upstream serialized Lambda creation and updates because of excessive memory usage (see hashicorp/terraform#9364).
After investigating, we found that the HTTP request logging middleware causes this. It logs the Lambda archive as a base64-encoded string, and doing so creates multiple copies of the request body in memory, which bloats memory usage.
This change fixes that by redacting the body in the logs for the Create/Update Lambda calls.

The PR introduces two patches. One removes the Lambda serialization and the other fixes the HTTP request logging middleware for the Lambda CreateFunction and UpdateFunctionCode operations.
With both patches applied, Lambdas are created and updated in parallel without excessive memory usage. Users can still limit the parallelism with the CLI flag --parallel if they wish.
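
To illustrate the logging patch described above, here is a minimal sketch of redacting a request body before it reaches the log. It is not the code from this PR: `redactedOps` and `logBody` are hypothetical names, and only the operation names CreateFunction and UpdateFunctionCode come from the description.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

// redactedOps is a hypothetical set of operations whose request bodies
// carry a code archive and should therefore never be written to the log.
var redactedOps = map[string]bool{
	"CreateFunction":     true,
	"UpdateFunctionCode": true,
}

// logBody returns the text a hypothetical request logger would emit for a
// body. For redacted operations it reports only the size, so the logger
// never copies the archive.
func logBody(operation string, body []byte) string {
	if redactedOps[operation] {
		return fmt.Sprintf("[redacted, %d bytes]", len(body))
	}
	return string(body)
}

func main() {
	// Simulate a 1 MiB archive sent as a base64-encoded payload.
	archive := bytes.Repeat([]byte{0x50}, 1<<20)
	payload := []byte(base64.StdEncoding.EncodeToString(archive))

	fmt.Println(logBody("CreateFunction", payload))
	fmt.Println(logBody("GetFunction", []byte(`{"FunctionName":"demo"}`)))
}
```

The sketch decides per operation, so other calls keep their full request logs and only the archive uploads are shortened.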

Relates to #2206

@flostadler flostadler self-assigned this May 23, 2024
@flostadler flostadler marked this pull request as draft May 23, 2024 09:14

Does the PR have any schema changes?

Looking good! No breaking changes found.
No new resources/functions.

Maintainer note: consult the runbook for dealing with any breaking changes.

…mbda code archive

When creating Lambda functions and uploading the code directly, the whole archive
is logged as a base64-encoded string by the HTTP request logger.
Doing so creates multiple copies of the body in memory, which leads
to memory bloat.
This change fixes that by redacting the body in the logs for the Create/Update Lambda
calls.
@flostadler flostadler requested a review from a team May 31, 2024 15:14
@flostadler flostadler marked this pull request as ready for review May 31, 2024 15:14
@flostadler flostadler requested a review from t0yv0 May 31, 2024 15:30
+	return &wrappedRequestResponseLogger{wrapped: wrapped}
+}
+
+//go:linkname decomposeHTTPResponse github.com/hashicorp/aws-sdk-go-base/v2.decomposeHTTPResponse
Contributor Author

I want to highlight this. If we don't do this, we need to either fork aws-sdk-go-base or not log the response at all.

If upstream renames or removes the symbol, the build fails, so we catch it at build time.

@mjeffryes
Member

This could be a nice win for our users, but it might be worth proposing an upstream PR before making a patch.

@flostadler
Contributor Author

flostadler commented May 31, 2024

This could be a nice win for our users, but it might be worth proposing an upstream PR before making a patch.

I'm already working on preparing an upstream PR, but I'd propose that we still go forward with the patch for the time being because this change will touch two different repos upstream (hashicorp/aws-sdk-go-base & hashicorp/terraform-provider-aws).
My hunch is that this will take a substantial amount of time to get merged (even small patches for panics take weeks). If it goes faster, we can remove the patch again. Wdyt?

+	// lambda code. Logging the lambda code leads to memory bloating because it allocates a lot of copies of the
+	// body
+	o.APIOptions = append(o.APIOptions, func(stack *middleware.Stack) error {
+		loggingMiddleware, err := stack.Deserialize.Remove("TF_AWS_RequestResponseLogger")
Contributor Author

If the name of the logging middleware ever changes, this will fail during the integration tests.

Member

Nice !!
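
The remove-by-name pattern discussed in this thread can be sketched with a toy middleware stack. This is an illustration under stated assumptions, not the smithy-go API: `step`, `stack`, `remove`, and `wrapLogger` are hypothetical, and only the step name `TF_AWS_RequestResponseLogger` is taken from the diff.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("not found")

// step is a hypothetical named middleware step (not the smithy-go type).
type step struct {
	id  string
	run func(body string) string
}

// stack is a hypothetical ordered list of named steps.
type stack struct{ steps []step }

// remove deletes the step with the given id and returns it.
// It returns an error when no step has that id.
func (s *stack) remove(id string) (step, error) {
	for i, st := range s.steps {
		if st.id == id {
			s.steps = append(s.steps[:i], s.steps[i+1:]...)
			return st, nil
		}
	}
	return step{}, fmt.Errorf("step %q: %w", id, errNotFound)
}

// wrapLogger swaps the named logger for a wrapper that hands it a
// redacted placeholder instead of the real body.
func wrapLogger(s *stack, id string) error {
	orig, err := s.remove(id)
	if err != nil {
		// A renamed step surfaces here instead of being silently skipped.
		return err
	}
	s.steps = append(s.steps, step{
		id: id,
		run: func(body string) string {
			return orig.run(fmt.Sprintf("[redacted %d bytes]", len(body)))
		},
	})
	return nil
}

func main() {
	s := &stack{steps: []step{{
		id:  "TF_AWS_RequestResponseLogger",
		run: func(body string) string { return "log: " + body },
	}}}
	if err := wrapLogger(s, "TF_AWS_RequestResponseLogger"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s.steps[0].run("zip archive bytes"))
}
```

Returning the error from `remove` is what makes a rename visible: the setup fails loudly rather than leaving the original logger in place.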

}

if v, ok := d.GetOk("filename"); ok {
- // Grab an exclusive lock so that we're only reading one function into memory at a time.
Member

Nice.
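
For context on what removing the exclusive lock changes, here is a minimal sketch of running deployments concurrently with an upper bound, similar in spirit to limiting work with --parallel. It is an assumption-laden illustration, not the provider's or the engine's implementation: `deployAll` and its parameters are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// deployAll runs deploy for every name, with at most `parallel` calls in
// flight at once. A global mutex would correspond to parallel == 1.
func deployAll(names []string, parallel int, deploy func(string) error) []error {
	if parallel < 1 {
		parallel = 1 // avoid an unbuffered semaphore that would block forever
	}
	errs := make([]error, len(names))
	sem := make(chan struct{}, parallel) // counting semaphore
	var wg sync.WaitGroup
	for i, name := range names {
		wg.Add(1)
		go func(i int, name string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when done
			errs[i] = deploy(name)   // each goroutine writes its own index
		}(i, name)
	}
	wg.Wait()
	return errs
}

func main() {
	names := []string{"fn-a", "fn-b", "fn-c", "fn-d"}
	errs := deployAll(names, 2, func(name string) error {
		fmt.Println("deploying", name)
		return nil
	})
	fmt.Println("failures:", countErrors(errs))
}

// countErrors reports how many deployments returned an error.
func countErrors(errs []error) int {
	n := 0
	for _, err := range errs {
		if err != nil {
			n++
		}
	}
	return n
}
```

The order of the "deploying" lines is not deterministic, which is expected once the work is no longer serialized.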

+//go:linkname decomposeHTTPResponse github.com/hashicorp/aws-sdk-go-base/v2.decomposeHTTPResponse
+func decomposeHTTPResponse(ctx context.Context, resp *http.Response, elapsed time.Duration) (map[string]any, error)
+
+func (r *wrappedRequestResponseLogger) HandleDeserialize(ctx context.Context, in middleware.DeserializeInput, next middleware.DeserializeHandler,
Member

Guessing this was inlined and edited from upstream? I think that's great and worth doing; it would just be helpful to indicate that in the comment and highlight the edits, just in case. Since we forked it, upstream changes to this handler will no longer propagate, but that sounds OK.

Member

@t0yv0 t0yv0 left a comment

Thank you! This is massively helpful. Apologies for the long review cycles here, I should have gotten to this much sooner.

@t0yv0
Member

t0yv0 commented Jun 4, 2024

It fixes #2206 right or there's some work remaining?

@flostadler
Contributor Author

It fixes #2206 right or there's some work remaining?

It fixes it! The added test would fail without the patches

@flostadler flostadler changed the title Prototype: Remove lambda serialization Change Lambdas to create and update in parallel Jun 7, 2024
@flostadler flostadler merged commit 42cf551 into master Jun 7, 2024
24 checks passed
@flostadler flostadler deleted the 2206-lambda-serialization branch June 7, 2024 07:12