dynamic fwd proxy: fix race condition in filter destructor #32692
Check was skipped
This check was not triggered in this CI run
Details
Request (pr/32692/main@713cbad)
@cpakulski 713cbad
#32692 merge
main@0bd73e3
dynamic fwd proxy: fix race condition in filter destructor
Commit Message:
dynamic fwd proxy: fix race condition in filter destructorAdditional Description:
Dynamic fwd proxy is specific comparing to other filters because it controls upstream cluster. The filter registers itself (this
pointer) in ClusterManager as cluster callbacks: https://github.com/envoyproxy/envoy/blob/release/v1.29/source/extensions/filters/http/dynamic_forward_proxy/proxy_filter.h#L37.
In the filter's destructor, tls slot destructor will remove the registered callbacks. The problem is that tls slots are not destructed atomically. Instructions to release slots are posted on each thread and main thread continues with filter destructor. This means that there is a short period of time when the filter is gone (memory was released) but callbacks on each worker thread still point to the deleted callbacks. If at that time a worker thread iterates over callbacks, like here: https://github.com/envoyproxy/envoy/blob/release/v1.29/source/common/upstream/cluster_manager_impl.cc#L1314, it will access a callback pointer which has already been released.I observed a crash when I reconfigured a dynamic fwd proxy 2 times per second. The code crashed after ~72 hrs. When I reconfigure the filter 50 timer per second the code crashes within 10 minutes.
The fix is to introduce some order when destructor is called. When callbacks are added to each thread, the added counter is increased. That counter is decremented on each thread when callbacks are deregistered. The destructor waits on condition variable until
this
pointer has been removed from all threads and the counter drops to zero.Risk Level: Low
Testing: Manual. It is difficult to simulate thread switching in unit/integration tests.
Docs Changes: No
Release Notes: No
Platform Specific Features: No
Fixes #32427
Environment
Request variables
Key | Value |
---|---|
ref | 49b10e5f2b49cb9cc2de36302ebd1e549b4d914a |
sha | 713cbad |
pr | 32692 |
base-sha | 0bd73e3 |
actor | @cpakulski |
message | dynamic fwd proxy: fix race condition in filter destructor... |
started | 1709775037.880623 |
target-branch | main |
trusted | false |
Build image
Container image/s (as used in this CI run)
Key | Value |
---|---|
default | envoyproxy/envoy-build-ubuntu:0ca52447572ee105a4730da5e76fe47c9c5a7c64 |
mobile | envoyproxy/envoy-build-ubuntu:mobile-0ca52447572ee105a4730da5e76fe47c9c5a7c64 |
Version
Envoy version (as used in this CI run)
Key | Value |
---|---|
major | 1 |
minor | 30 |
patch | 0 |
dev | true |