-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[thrift][bugfix]try to fix thrift request overflow crash. #12890
Conversation
PoolFailure will set host to null,cause the crush in this case. Signed-off-by: Guang Yang <pyrl247@gmail.com>
Thanks. You’ll need to fix the code formatting (if you click through to the details of the failed build you can eventually drill down to the log file that shows what’s formatted incorrectly, or you can install clang-format and use the tools/code_format/check_format.py directly). Also, let’s add a test to validate the fix. |
Signed-off-by: Guang Yang <pyrl247@gmail.com>
I didn't found where to add the test case,Could you please tell me where to add it ? |
For the test case, I think you can probably use this existing one as an example:
and add something like:
at the end... (which should crash without your fix). I think something along those lines would work. |
AppExceptionType::InternalError, | ||
fmt::format("too many connections to '{}'", upstream_host_->address()->asString())), | ||
AppException(AppExceptionType::InternalError, | ||
fmt::format("thrift upstream request: too many connections")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things:
- Since you've removed the string interpolation, you don't need to call
fmt::format
here. - There is another case below that uses
upstream_host_
below. That should be modified as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean the Timeout case? When it hits that case ,as far as I can see the upstream_host_ will keep the real one.Could you please check if it's right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should depend on that behavior even if it's true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is better to report more detail about the Exception.How about change like this
`
parent_.callbacks_->sendLocalReply(
AppException(
AppExceptionType::InternalError,
fmt::format("connection failure '{}'", (upstream_host_ != nullptr)
? upstream_host_->address()->asString()
: "to upstream")),
true);
`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
@rgs1 What about I change the test/mocks/tcp/mocks.cc the poolFailure function like this?It will trigger the crash.
|
That probably works too. |
Signed-off-by: Guang Yang <pyrl247@gmail.com>
Signed-off-by: Guang Yang <pyrl247@gmail.com>
@@ -407,7 +407,8 @@ TEST_F(ThriftRouterTest, PoolOverflowFailure) { | |||
EXPECT_THAT(app_ex.what(), ContainsRegex(".*too many connections.*")); | |||
EXPECT_TRUE(end_stream); | |||
})); | |||
context_.cluster_manager_.tcp_conn_pool_.poolFailure(ConnectionPool::PoolFailureReason::Overflow); | |||
context_.cluster_manager_.tcp_conn_pool_.poolFailureWithNullHost( | |||
ConnectionPool::PoolFailureReason::Overflow); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
verified that this crashes without your fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes,if i use the previous code ,it will crash in my ci like this .
[ RUN ] ThriftRouterTest.PoolOverflowFailure
TestRandomGenerator running with seed -443243904
[2020-09-09 13:03:11.969][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x0
[2020-09-09 13:03:11.969][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2020-09-09 13:03:11.969][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:92] Envoy version: 0/1.16.0-dev/test/RELEASE/BoringSSL
[2020-09-09 13:03:11.969][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #0: __restore_rt [0x7faf6ee3a8a0]
[2020-09-09 13:03:11.975][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #1: Envoy::Tcp::ConnectionPool::MockInstance::poolFailure() [0x9b3dd3]
[2020-09-09 13:03:11.981][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #2: Envoy::Extensions::NetworkFilters::ThriftProxy::Router::ThriftRouterTest_PoolOverflowFailure_Test::TestBody() [0x788b44]
[2020-09-09 13:03:11.987][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #3: testing::internal::HandleExceptionsInMethodIfSupported<>() [0x1293a78]
[2020-09-09 13:03:11.992][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #4: testing::Test::Run() [0x12939a5]
[2020-09-09 13:03:11.998][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #5: testing::TestInfo::Run() [0x12947f0]
[2020-09-09 13:03:12.003][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #6: testing::TestSuite::Run() [0x1295177]
[2020-09-09 13:03:12.009][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #7: testing::internal::UnitTestImpl::RunAllTests() [0x12a2047]
[2020-09-09 13:03:12.015][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #8: testing::internal::HandleExceptionsInMethodIfSupported<>() [0x12a1968]
[2020-09-09 13:03:12.020][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #9: testing::UnitTest::Run() [0x12a17ef]
[2020-09-09 13:03:12.026][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #10: Envoy::TestRunner::RunTests() [0xbf1734]
[2020-09-09 13:03:12.031][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #11: main [0xbf092b]
[2020-09-09 13:03:12.032][5454][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #12: __libc_start_main [0x7faf6ea58b97]
================================================================================
INFO: Elapsed time: 1017.162s, Critical Path: 329.15s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks.
test/mocks/tcp/mocks.h
Outdated
@@ -59,6 +59,7 @@ class MockInstance : public Instance { | |||
|
|||
Envoy::ConnectionPool::MockCancellable* newConnectionImpl(Callbacks& cb); | |||
void poolFailure(PoolFailureReason reason); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can probably just add a bool host_null
param and reuse poolFailure()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably want a changelog fix, e.g.:
Signed-off-by: Guang Yang <pyrl247@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks. Just one last nit.
Signed-off-by: Guang Yang <pyrl247@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I went ahead and fix the merge conflict on the docs.
try to fix thrift request overflow crash.When request overflow ,the OnPoolFailure will set host to null,cause the crash in this case.
Signed-off-by: Guang Yang pyrl247@gmail.com
For an explanation of how to fill out the fields, please see the relevant section
in PULL_REQUESTS.md
Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Deprecated:]