Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix null metadata dereference in Thrift proxy router for shadow traffic #38181

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

PeterL328
Copy link
Contributor

@PeterL328 PeterL328 commented Jan 24, 2025

Commit Message:
Addresses a segmentation fault in Envoy’s Thrift proxy when handling shadow traffic. The crash occurs due to a null pointer dereference of MessageMetadata within the shadow router path during response parsing from upstream. In scenarios where the Thrift decoder unexpectedly returns a complete parsing status without metadata, the previous logic led to crashing. Now, UpstreamRequest has been updated to include null-checks around metadata access during handleRegularResponse(), treating it as an error condition.
Added a new Counter to track null metadata from responses for Thrift router
thrift.upstream_resp_metadata_null

Additional Description:
Backtrace from production machine

Program terminated with signal SIGSEGV, Segmentation fault.  
#0  0x00007f4ea0ea32ab in raise () from /lib/x86_64-linux-gnu/libpthread.so.0  
   [Current thread is 1 (Thread 0x7f4e9ac44700 (LWP 64))]  
  
Stack trace:  
  
#0  0x00007f4ea0ea32ab in raise () from /lib/x86_64-linux-gnu/libpthread.so.0  
#1  0x000056039f8dad43 in Envoy::SignalAction::sigHandler(int, siginfo_t*, void*)   
    at external/envoy/source/common/signal/signal_action.cc:53  
  
#2  <signal handler called>  
  
#3  std::_1::optional_storage_base<Envoy::Extensions::NetworkFilters::ThriftProxy::MessageType, false>::has_value() const   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/optional:359  
  
#4  std::1::optional<Envoy::Extensions::NetworkFilters::ThriftProxy::MessageType>::value() const &   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/optional:826  
  
#5  Envoy::Extensions::NetworkFilters::ThriftProxy::MessageMetadata::messageType() at external/envoy/source/extensions/filters/network/thrift_proxy/metadata.h:110  
  
#6  Envoy::Extensions::NetworkFilters::ThriftProxy::Router::UpstreamRequest::handleRegularResponse(..)   
    at external/envoy/source/extensions/filters/network/thrift_proxy/router/upstream_request.cc:141  
  
#7  0x000056039ed896fa in Envoy::Extensions::NetworkFilters::ThriftProxy::Router::UpstreamRequest::handleUpstreamData   
    at external/envoy/source/extensions/filters/network/thrift_proxy/router/upstream_request.cc:198  
  
#8  0x000056039ed82ff9 in non-virtual thunk to Envoy::Extensions::NetworkFilters::ThriftProxy::Router::ShadowRouterImpl::onUpstreamData(..)   
    at external/envoy/source/extensions/filters/network/thrift_proxy/router/shadow_writer_impl.cc:288  
  
#9  0x000056039f3403ed in Envoy::Tcp::ActiveTcpClient::onUpstreamData at external/envoy/source/common/tcp/conn_pool.h:124  
  
#10 Envoy::Tcp::ActiveTcpClient::ConnReadFilter::onData at external/envoy/source/common/tcp/conn_pool.h:47  
  
#11 0x000056039f7fb3f5 in Envoy::Network::FilterManagerImpl::onContinueReading(..)   
    at external/envoy/source/common/network/filter_manager_impl.cc:92  
  
#12 0x000056039f7985a5 in Envoy::Network::ConnectionImpl::onRead at external/envoy/source/common/network/connection_impl.cc:384  
  
#13 Envoy::Network::ConnectionImpl::onReadReady at external/envoy/source/common/network/connection_impl.cc:709  
  
#14 0x000056039f7948d7 in Envoy::Network::ConnectionImpl::onFileEvent events=3   
    at external/envoy/source/common/network/connection_impl.cc:648  
  
#15 0x000056039f79e536 in Envoy::Network::ConnectionImpl::ConnectionImpl(..)::$_7::operator()(unsigned int) const   
    at external/envoy/source/common/network/connection_impl.cc:103  
  
#16 std::1::invoke<R, Func, ArgTypes...>(Func&&, ArgTypes&&...) (f=..., __args=<error reading variable>)   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/type_traits/invoke.h:344  
  
#17 std::1::invoke_void_return_wrapper<absl::lts_20240722::Status, false>::call(..)   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/type_traits/invoke.h:411  
  
#18 std::1::function::alloc_func<..>::operator()(unsigned int&&)   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/functional/function.h:169  
  
#19 std::1::function::func<..>::operator()(unsigned int&&)   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/functional/function.h:311  
  
#20 0x000056039f785086 in std::1::function::value_func<..>::operator()(unsigned int&&) const   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/functional/function.h:428  
  
#21 std::1::function<..>::operator()(unsigned int) const   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/functional/function.h:981  
  
#22 Envoy::Event::DispatcherImpl::createFileEvent(..)::$_0::operator()(unsigned int) const   
    at external/envoy/source/common/event/dispatcher_impl.cc:186  
  
#23 std::1::invoke<R, Func, ArgTypes...>(Func&&, ArgTypes&&...)   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/type_traits/invoke.h:344  
  
#24 std::1::invoke_void_return_wrapper<absl::lts_20240722::Status, false>::call(..)   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/type_traits/invoke.h:411  
  
#25 std::1::function::alloc_func<..>::operator()(unsigned int&&)   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/functional/function.h:169  
  
#26 std::1::function::func<..>::operator()(unsigned int&&)   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/functional/function.h:311  
  
#27 0x000056039f786155 in std::1::function::value_func<..>::operator()(unsigned int&&) const   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/functional/function.h:428  
  
#28 std::1::function<..>::operator()(unsigned int) const   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/functional/function.h:981  
  
#29 Envoy::Event::FileEventImpl::mergeInjectedEventsAndRunCb(int)   
    at external/envoy/source/common/event/file_event_impl.cc:161  
  
#30 0x000056039fa02503 in event_process_active_single_queue ()  
#31 0x000056039fa010f1 in event_base_loop ()  
#32 0x000056039ef71d71 in Envoy::Server::WorkerImpl::threadRoutine(..)   
    at external/envoy/source/server/worker_impl.cc:156  
  
#33 0x000056039fa83e7f in std::1::function<void()>::operator()() const   
    at __bazel_toolchain_llvm_repo/bin/../include/c++/v1/functional/function.h:981  
  
#34 Envoy::Thread::PosixThreadFactory::createPthread(Envoy::Thread::ThreadHandle*)::$_0::operator()(void*) const   
    at external/envoy/source/common/common/posix/thread_impl.cc:178  
  
#35 Envoy::Thread::PosixThreadFactory::createPthread(Envoy::Thread::ThreadHandle*)::$_0::_invoke(void*)   
    at external/envoy/source/common/common/posix/thread_impl.cc:173  
  
#36 0x00007f4ea0e97609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0  
#37 0x00007f4ea0dac133 in clone () from /lib/x86_64-linux-gnu/libc.so.6 

Risk Level: Medium
Testing: new unit test
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A

Copy link

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #38181 was opened by PeterL328.

see: more, trace.

@PeterL328 PeterL328 force-pushed the yleng/fix_crash_with_thrift_proxy_null_response_metadata_for_shadow branch from 432baf3 to 9e8893e Compare January 27, 2025 23:47
@PeterL328 PeterL328 marked this pull request as ready for review January 27, 2025 23:47
@PeterL328 PeterL328 requested a review from zuercher as a code owner January 27, 2025 23:47
This commit addresses a segmentation fault in Envoy’s Thrift proxy when handling shadow traffic. The crash occurs due to a null pointer dereference of MessageMetadata within the shadow router path during response parsing from upstream.
In scenarios where the Thrift decoder unexpectedly returns a complete parsing status without metadata, the previous logic led to crashing. Now, UpstreamRequest has been updated to include null-checks around metadata access during handleRegularResponse(), treating it as an error condition.

Signed-off-by: Peter Leng <Peter Leng>
Signed-off-by: Peter Leng <Peter Leng>
Signed-off-by: Peter Leng <Peter Leng>
@PeterL328 PeterL328 force-pushed the yleng/fix_crash_with_thrift_proxy_null_response_metadata_for_shadow branch from 9e8893e to 0f814ba Compare January 27, 2025 23:57
@PeterL328
Copy link
Contributor Author

PeterL328 commented Jan 28, 2025

@zuercher Hi Stephan,
can we run the skipped tests in the CI?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant