-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stats: Create stats for http codes with the symbol table. #6733
Conversation
…or stats. Signed-off-by: Joshua Marantz <jmarantz@google.com>
…age objects, holding only one reference to the symbol table but maintaining RAII. Signed-off-by: Joshua Marantz <jmarantz@google.com>
…tp::CodeStatsImpl. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…hink it doesn't add clarity or useful abstraction. Signed-off-by: Joshua Marantz <jmarantz@google.com>
…ructs. Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz holding off assigning this until it's a not a WIP any more /wait |
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
/wait I am actually homing in on getting this master-merged & working again. |
…t-names for http responses. Not compiling yet, but pushing anyway. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
… it. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM. Comments are minor.
Signed-off-by: Joshua Marantz <jmarantz@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ahedberg @mattklein123 can you PTAL ?
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks this is very cool stuff. Meta comment: Do you think the risk here is low? It seems more like medium? Some of this is a little scary.
I have a bunch of random comments which I'm flushing out. I didn't look at tests yet. Thank you!
/wait
source/common/router/router.cc
Outdated
alt_stat_prefix_ = std::string(request_alt_name->value().getStringView()) + "."; | ||
// NOTE: converting this header value into a StatName requires taking a | ||
// global symbol-table lock. Is this scenario common enough to raise | ||
// concern? Or is this effectively a test hook? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a test feature. People do use this in production, but I'm not sure how often it's actually used. I think it's probably fine but I would maybe put in a TODO about thinking about improving this in the future and maybe put a warning in the docs on the documentation for this header? WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Found the doc:
./docs/root/configuration/http_filters/router_filter.rst
./docs/root/configuration/http_conn_man/header_sanitizing.rst
Since we are still using FakeSymbolTable this will not take a lock yet, so this PR doesn't introduce a lock in the hot-path.
Maybe I should add a tls-cache for cases where we really need to compose new StatName from string in the hotpath. I'm thinking there may be a few cases like this, which would justify adding this capability, rather than having to explain lock overhead details in user doc. Since we are still using Fakes I'd prefer to do that in a separate PR.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think let's just put in a TODO for now and not add any docs. I don't think very many people are using this feature and I doubt they would notice the lock overhead even if we added it.
With that said, although I'm pretty certain this change is going to make this change faster, would it be worth it to add a microbenchmark for this code and then we can see the perf change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I have a microbenchmark, and I've been running it on this in various forms: test/common/http/codes_speed_test.cc. This makes things faster with real symbol tables, but it's a wash with fake symbol tables (since that really just degenerates to the same string concatenations in the hot-path).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a draft PR which provides a mechanism to fix this, but I don't know if it's worth the complexity for this feature. #7008
My feeling is that I'll leave that in draft mode until I see if there's more need for this beyond the header-controlled stat-prefix.
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…of them is instantiated. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks looks good modulo remaining comments. WDYT about having a microbenchmark here first per my other comment?
/wait
source/common/router/router.cc
Outdated
alt_stat_prefix_ = std::string(request_alt_name->value().getStringView()) + "."; | ||
// NOTE: converting this header value into a StatName requires taking a | ||
// global symbol-table lock. Is this scenario common enough to raise | ||
// concern? Or is this effectively a test hook? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I have a microbenchmark, and I've been running it on this in various forms: test/common/http/codes_speed_test.cc. This makes things faster with real symbol tables, but it's a wash with fake symbol tables (since that really just degenerates to the same string concatenations in the hot-path).
…tus codes. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, LGTM. Few small comments and general I would love to constify all the StatName variables where possible.
/wait.
Signed-off-by: Joshua Marantz <jmarantz@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Thanks for the review of this deep complex PR @mattklein123 |
* master: (65 commits) proto: Add PATCH method to RequestMethod enum (envoyproxy#6737) exe: drop unused deps on zlib compressor code (envoyproxy#7022) coverage: fix some misc coverage (envoyproxy#7033) Enable proto schema for router_check_tool (envoyproxy#6992) stats: rework stat sink flushing to centralize counter latching (envoyproxy#6996) [test] convert lds api test config stubs to v2 (envoyproxy#7021) router: scoped rds (2c): implement scoped rds API (envoyproxy#6932) build: Add option for size-optimized binary (envoyproxy#6960) test: adding an integration test framework for file-based LDS (envoyproxy#6933) doc: update obsolete ref to api/XDS_PROTOCOL.md (envoyproxy#7002) dispatcher: faster runOnAllThreads (envoyproxy#7011) example: add csrf sandbox (envoyproxy#6805) fix syntax of gcov exclusion zone. (envoyproxy#7023) /runtime_modify: add support for query params in body (envoyproxy#6977) stats: Create stats for http codes with the symbol table. (envoyproxy#6733) health check: fix more fallout from inline deletion change (envoyproxy#6988) Max heap fix (envoyproxy#7016) Add support to unregister from lifecycle notifications (envoyproxy#6984) build spdy_core_alt_svc_wire_format (envoyproxy#7010) ext_authz: Make sure initiateCall only called once (envoyproxy#6949) ... Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Description: The HTTP response stats are dynamically generated in response to HTTP response codes from upstream, combined with cluster and vhost names. This changes makes that all occur using the SymbolTable API using a pattern that avoids global locks whenever an HTTP response code is re-used.
Note that there is even in a non-symbol table world, a global lock was always taken whenever a new http response stat had to be created, but there was none taken when a response was re-used. This brings the symbol-table API usage into that same state: no global locks for using a stat after it's been created.
Note also that as of now, the FakeSymbolTableImpl in use does not require any global locks at all, but this change is a pre-req for switching to the real SymbolTableImpl without taking a performance hit.
Risk Level: medium
Testing: //test/...
Docs Changes: n/a
Release Notes: n/a