Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logger: use server stats scope at grpc logger #18067

Merged
merged 11 commits into from
Oct 1, 2021

Conversation

lambdai
Copy link
Contributor

@lambdai lambdai commented Sep 10, 2021

Commit Message:
Fix the grpc access logger counter raised illegal memory access
by using server scope in shared loggers

Wider stats violation and detection can be found in #18047

Signed-off-by: Yuchen Dai silentdai@gmail.com

Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
fix #18066
[Optional Deprecated:]
[Optional API Considerations:]

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@zuercher
Copy link
Member

/assign-from @envoyproxy/first-pass-reviewers

@repokitteh-read-only
Copy link

@envoyproxy/first-pass-reviewers assignee is @rojkov

🐱

Caused by: a #18067 (comment) was created by @zuercher.

see: more, trace.

@mathetake
Copy link
Member

JFYI #18081 this is somewhat related to this:)

Copy link
Member

@rojkov rojkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good. I've added one nit.

/cc @pradeepcrao might add comments.

source/extensions/access_loggers/grpc/config_utils.cc Outdated Show resolved Hide resolved
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@lambdai
Copy link
Contributor Author

lambdai commented Sep 13, 2021

Thank you @rojkov

@lambdai
Copy link
Contributor Author

lambdai commented Sep 13, 2021

@mattklein123 @ggreenway Could you take a quick look so that we can decide the next step?

My point is this PR fix the crash described in #18066 and another crash pattern combining the efforts in #18081

The wanted stats can be added in the follow up PRs

rojkov
rojkov previously approved these changes Sep 14, 2021
Copy link
Member

@rojkov rojkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM given the merge conflict is resolved.

Copy link
Contributor

@yanavlasov yanavlasov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/wait

source/extensions/access_loggers/grpc/config_utils.cc Outdated Show resolved Hide resolved
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@lambdai
Copy link
Contributor Author

lambdai commented Sep 15, 2021

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #18067 (comment) was created by @lambdai.

see: more, trace.

@lambdai lambdai requested a review from yanavlasov September 16, 2021 18:10
@lambdai
Copy link
Contributor Author

lambdai commented Sep 16, 2021

gentle ping

rojkov
rojkov previously approved these changes Sep 17, 2021
Copy link
Member

@rojkov rojkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This feels less hacky indeed.

@jmarantz
Copy link
Contributor

another main-merge is needed.

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@lambdai
Copy link
Contributor Author

lambdai commented Sep 17, 2021

another main-merge is needed.

Thanks! Fixing and running local CI.

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@lambdai
Copy link
Contributor Author

lambdai commented Sep 23, 2021

Is it worth adding an integration test that would crash/fail without this fix?
Change looks good overall.
/wait-any

Thanks! The scope arg is removed from the factory method. The code that crashes in the past is not compilable now.

What I was thinking was an integration test that uses a grpc logger, then reloads the config, and does more logging, to trigger the case that would have crashed before this fix.

ack. Working on it

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@lambdai lambdai requested a review from ggreenway September 23, 2021 22:16
Copy link
Contributor

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things to improve in the new integration test, then I think this is ready to merge.

@@ -597,6 +597,59 @@ TEST_P(LdsIntegrationTest, NewListenerWithBadPostListenSocketOption) {
test_server_->waitForCounterGe("listener_manager.listener_create_failure", 1);
}

// Verify the grpc cached logger is available after the initial logger filter is destroyed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please elaborate on this and describe the stat scope lifetime issue this is testing for

test/integration/xds_integration_test.cc Outdated Show resolved Hide resolved
test/integration/xds_integration_test.cc Outdated Show resolved Hide resolved
@ggreenway
Copy link
Contributor

/wait

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@lambdai
Copy link
Contributor Author

lambdai commented Sep 28, 2021

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #18067 (comment) was created by @lambdai.

see: more, trace.

Copy link
Contributor

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@lambdai
Copy link
Contributor Author

lambdai commented Sep 29, 2021

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit isn't fully completed, but will still attempt retrying.
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #18067 (comment) was created by @lambdai.

see: more, trace.

@lambdai
Copy link
Contributor Author

lambdai commented Sep 29, 2021

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #18067 (comment) was created by @lambdai.

see: more, trace.

@lambdai
Copy link
Contributor Author

lambdai commented Sep 30, 2021

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #18067 (comment) was created by @lambdai.

see: more, trace.

Copy link
Member

@rojkov rojkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

/assign @yanavlasov for final approval

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stats scope for grpc access log should be envoy wide
7 participants