Crash regression from #10362 #10901
Comments
Maybe #10362:
cc: @yanavlasov
@rgs1 if possible can you get a proper stack trace with debug symbols?
Reverting #10362 indeed made it go away. Hmm. I might not have time to debug more of this today, so a better stack trace might need to wait unless someone else can get a repro of this first (if not I'll follow-up later/tomorrow).
OK we should probably revert #10362 in the meantime. cc @yanavlasov
Is the plan still to revert this? I might have time to look into this again later today, but in parallel we'd like to still stay close to master (without carrying a revert patch) so reverting would be nice.
@rgs1 can you file the revert PR? If not I can do it later.
Yup -- coming up.
@mattklein123 revert here: #10919.
@rgs1 stack trace would be very helpful or some pointers on how to reproduce this.
For others following along:
That's from a clean start, not a hot-restart.
I was able to recreate it. The conditions are:
This violates an invariant of the new split initialization of the cluster manager. In debug builds it fires an ASSERT in the cluster manager. In release builds execution continues and later crashes with the stack that @rgs1 attached, which I think is a red herring at this point. I will debug this further later on to figure out how the state machine in the cluster manager gets out of whack.
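For readers following along, here is a minimal sketch of why the release-build stack can point away from the real bug. It is a toy model, not Envoy's cluster manager: the class `ToyInitManager`, its state names, and the `run` helper are all hypothetical, and the actual repro conditions are not reproduced here. It only illustrates the general pattern of a debug-only check that is skipped in release mode.

```python
class ToyInitManager:
    """Hypothetical two-phase initializer; NOT Envoy's ClusterManager."""

    def __init__(self, debug_checks: bool):
        self.debug_checks = debug_checks
        self.state = "loading_primary"
        self.clusters = {}

    def _check(self, condition: bool, message: str) -> None:
        # Mimics a debug-only ASSERT: skipped entirely in "release" mode.
        if self.debug_checks and not condition:
            raise AssertionError(message)

    def finish_primary(self) -> None:
        self._check(self.state == "loading_primary", "primary phase finished twice")
        self.state = "loading_secondary"

    def add_secondary(self, name: str) -> None:
        # Assumed invariant: secondaries are only added after the primary phase.
        self._check(self.state == "loading_secondary",
                    f"{name} added in state {self.state}")
        # Without the check, a half-initialized entry is silently recorded.
        self.clusters[name] = {"ready": self.state == "loading_secondary"}

    def route(self, name: str) -> str:
        cluster = self.clusters[name]
        if not cluster["ready"]:
            # The failure surfaces here, far from the call that caused it.
            raise RuntimeError(f"cluster {name} used before initialization")
        return f"routed to {name}"


def run(debug_checks: bool) -> str:
    """Return the name of the method in which the failure was reported."""
    manager = ToyInitManager(debug_checks)
    try:
        manager.add_secondary("backend")  # bug: called before finish_primary()
        manager.finish_primary()
        manager.route("backend")
    except AssertionError:
        return "add_secondary"
    except RuntimeError:
        return "route"
    return "no failure"
```

With checks enabled, `run(True)` reports the failure in `add_secondary`, where the invariant is actually broken. With checks disabled, `run(False)` reports it in `route`, which is why a release-build stack trace alone can be misleading.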
Fixed
We synced with master yesterday and are now getting a crash on startup (after a hot restart).
Here's the stack trace (not super useful):
I am suspecting either #10362 or #10842.
The full list of changes that we picked up:
Remove hardcoded type urls Part.2 #10848
upstream: fix panic on grpc unknown_service status on healthchecks #10863
Fix Windows compilation of test sources #10822
conn_pool: unifying status codes #10854
Windows compilation: enable compiling expanded list of extensions in envoy-static #10542
logger: Make log prefix configurable #10693
stream_info: Collapse constructors #10691
coverage: revert workarounds that are no longer neccessary #10837
Update LuaJIT patch - remove MAP_32BIT #10867
filter: postgres statistics network filter #10642
api/faq: add initial API versioning FAQ entries. #10829
Catch exception and return false in cases where std::regex_match throws. #10861
redis: Fix stack-use-after-scope in test #10840
http: downstream connect support #10720
init: order dynamic resource initialization to make RTDS always be first #10362
[test] fix fuzz tests that might crash on duplicate settings params #10779
Fix clang-tidy in source/common/http/conn_manager_config.h #10860
fix: upstream grpc stats on trailers only #10842
ip tagging: remember tags as builtins #10856
Remove vendor specific dynamo filter use from HCM config test #10858
router: allow retry of streaming/incomplete requests #10725
Update filter_chain_benchmark_test.cc #10850
[admin] extract stats handlers to separate file #10750