-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stats: Capture all the strings used to accumulate http stats in an object, plumbed through the system. #4997
Changes from 6 commits
d70cb18
9c49dec
df7d7a1
da648b3
ec66aaf
d509154
cf7565d
9cd564e
255c51b
cca3bc9
bdea1d7
e7d59b2
60d7454
eb0be50
60c4a73
4edb9cd
b8fc8da
228c52d
c257d69
2cc92ec
aea8630
30e0c15
47e014e
21e5a34
3a2cef0
dcf800b
72e6dd3
522e235
01bd77a
7802653
c3508fa
0c8ac51
4b32611
4f0ebb5
d08fc60
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -213,6 +213,11 @@ class Instance { | |
* @return the flush interval of stats sinks. | ||
*/ | ||
virtual std::chrono::milliseconds statsFlushInterval() const PURE; | ||
|
||
/** | ||
* @return Http::CodeStats the http response-code stats | ||
*/ | ||
virtual Http::CodeStats& codeStats() PURE; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not crazy about this being a server-wide thing and then having to be plumbed through everywhere. What's the reasoning for that? Can it potentially be per-HCM and then just exposed via HTTP filter callbacks? If you want it to be global can we create it in the singleton manager as part of the HCM (like the HTTP date provider)? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. HCM==HttpConnectionManager I assume? How many HCMs are there? One per thread? The reasoning for having it be global is that it's not going to be free to create or store, once SymbolTables are integrated. But if there were a bounded number created at startup it would be OK. There's no compelling reason there has to be only one, as long as they are not created on demand in the hot-path. What are the drawbacks having them be plumbed everywhere? In my mind the most painful thing is the number of many edits that required. However, I'd be up for a refactor -- even in advance of this PR. One thought I had was making Server::Configuration::FactoryContext more broadly available, and that would make TimeSource, Http::CodeStats, SymbolTable, and whatever else was needed easier to get. I hadn't really looked at HCM before but it looks like this is a way to pass around data that's derived from configuration, whereas Server::Configuration::FactoryContext looks more like it's sharing common data structures created by the server at startup, which feels like what this is. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Because IMO it's an abstraction leakage. Envoy is not a dedicated HTTP server (though yes we always have the admin server) so it seems incorrect to create this thing and pass it around everywhere when it only applies to HTTP traffic. IMO we can create one of these things per HCM (it's not per-thread, it's per listener effectively) or if we want one for all listeners you can create a singleton like here: https://github.com/envoyproxy/envoy/blob/master/source/extensions/filters/network/http_connection_manager/config.cc#L66 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fair enough -- individual HTTP-specific objects shouldn't be plumbed through where they are not needed. And using a singleton owned by a per-connection HTTP factory seems workable and I'll go with that for this PR. IIRC you told me once that pattern of singleton-management gets cleaned up between unit tests. But it does seem like there's a few different http-specific objects that want to be once per server...what do you think of having a class to hold and/or create those (HttpContext or HttpFactory or similar), and have that passed into the HCM constructor? This would effectively be like the singletons but a little cleaner from a statics perspective. WDYT? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The singleton manager avoids statics, it's an actual object. Take a look at the implementation. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think you're going to be much happier with the change I just made; I was trying to move stuff around for a cleaner dependency graph but I think it was better before I did this. I just did a commit to see how it looks via the GH interface. I'll follow up further. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Per DM on slack I am adding a new Http::Context object to hold the httpTracer and Http::CodeStats. The singleton-manager didn't work out due to same-thread assertions. |
||
}; | ||
|
||
} // namespace Server | ||
|
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -7,124 +7,145 @@ | |||
#include "envoy/stats/scope.h" | ||||
|
||||
#include "common/common/enum_to_int.h" | ||||
#include "common/common/fmt.h" | ||||
#include "common/common/utility.h" | ||||
#include "common/http/headers.h" | ||||
#include "common/http/utility.h" | ||||
|
||||
#include "absl/strings/match.h" | ||||
#include "absl/strings/str_cat.h" | ||||
#include "absl/strings/str_join.h" | ||||
|
||||
namespace Envoy { | ||||
namespace Http { | ||||
|
||||
void CodeUtility::chargeBasicResponseStat(Stats::Scope& scope, const std::string& prefix, | ||||
Code response_code) { | ||||
CodeStatsImpl::CodeStatsImpl() {} | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: del empty constuctor/destructor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Based on past experience I think we can improve the compile/link times for Envoy if we have explicit non-inlined destructors for all virtual classes. But injecting them incrementally in a few PRs without broader evangelism isn't going to move the needle. The Chromium style guide has similar observations: IMO the destructors are more important to leave outlined than the constructors, though, as I think the constructors will only cost you compile/link time when they are actually instantiated, whereas inlined destructors cost you compile-link speed every time they are referenced. For now though I'll just inline them to match current style. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. added #5182 for discussing inline dtors. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's fine, I don't really care as long as we are generally consistent. We already do this for mocks for the same reason. I will stop commenting on this. |
||||
|
||||
CodeStatsImpl::~CodeStatsImpl() {} | ||||
|
||||
void CodeStatsImpl::chargeBasicResponseStat(Stats::Scope& scope, const std::string& prefix, | ||||
Code response_code) { | ||||
// Build a dynamic stat for the response code and increment it. | ||||
scope.counter(fmt::format("{}upstream_rq_completed", prefix)).inc(); | ||||
scope.counter(fmt::format("{}upstream_rq_{}", prefix, groupStringForResponseCode(response_code))) | ||||
scope.counter(absl::StrCat(prefix, upstream_rq_completed_)).inc(); | ||||
scope | ||||
.counter(absl::StrCat(prefix, upstream_rq_, | ||||
CodeUtility::groupStringForResponseCode(response_code))) | ||||
.inc(); | ||||
scope.counter(fmt::format("{}upstream_rq_{}", prefix, enumToInt(response_code))).inc(); | ||||
scope.counter(absl::StrCat(prefix, upstream_rq_, enumToInt(response_code))).inc(); | ||||
} | ||||
|
||||
void CodeUtility::chargeResponseStat(const ResponseStatInfo& info) { | ||||
void CodeStatsImpl::chargeResponseStat(const ResponseStatInfo& info) { | ||||
const uint64_t response_code = info.response_status_code_; | ||||
chargeBasicResponseStat(info.cluster_scope_, info.prefix_, static_cast<Code>(response_code)); | ||||
|
||||
std::string group_string = groupStringForResponseCode(static_cast<Code>(response_code)); | ||||
std::string group_string = | ||||
CodeUtility::groupStringForResponseCode(static_cast<Code>(response_code)); | ||||
|
||||
// If the response is from a canary, also create canary stats. | ||||
if (info.upstream_canary_) { | ||||
info.cluster_scope_.counter(fmt::format("{}canary.upstream_rq_completed", info.prefix_)).inc(); | ||||
info.cluster_scope_.counter(fmt::format("{}canary.upstream_rq_{}", info.prefix_, group_string)) | ||||
info.cluster_scope_.counter(absl::StrCat(info.prefix_, canary_upstream_rq_completed_)).inc(); | ||||
info.cluster_scope_.counter(absl::StrCat(info.prefix_, canary_upstream_rq_, group_string)) | ||||
.inc(); | ||||
info.cluster_scope_.counter(fmt::format("{}canary.upstream_rq_{}", info.prefix_, response_code)) | ||||
info.cluster_scope_.counter(absl::StrCat(info.prefix_, canary_upstream_rq_, response_code)) | ||||
.inc(); | ||||
} | ||||
|
||||
// Split stats into external vs. internal. | ||||
if (info.internal_request_) { | ||||
info.cluster_scope_.counter(fmt::format("{}internal.upstream_rq_completed", info.prefix_)) | ||||
.inc(); | ||||
info.cluster_scope_ | ||||
.counter(fmt::format("{}internal.upstream_rq_{}", info.prefix_, group_string)) | ||||
info.cluster_scope_.counter(absl::StrCat(info.prefix_, internal_upstream_rq_completed_)).inc(); | ||||
info.cluster_scope_.counter(absl::StrCat(info.prefix_, internal_upstream_rq_, group_string)) | ||||
.inc(); | ||||
info.cluster_scope_ | ||||
.counter(fmt::format("{}internal.upstream_rq_{}", info.prefix_, response_code)) | ||||
info.cluster_scope_.counter(absl::StrCat(info.prefix_, internal_upstream_rq_, response_code)) | ||||
.inc(); | ||||
} else { | ||||
info.cluster_scope_.counter(fmt::format("{}external.upstream_rq_completed", info.prefix_)) | ||||
info.cluster_scope_.counter(absl::StrCat(info.prefix_, external_upstream_rq_completed_)).inc(); | ||||
info.cluster_scope_.counter(absl::StrCat(info.prefix_, external_upstream_rq_, group_string)) | ||||
.inc(); | ||||
info.cluster_scope_ | ||||
.counter(fmt::format("{}external.upstream_rq_{}", info.prefix_, group_string)) | ||||
.inc(); | ||||
info.cluster_scope_ | ||||
.counter(fmt::format("{}external.upstream_rq_{}", info.prefix_, response_code)) | ||||
info.cluster_scope_.counter(absl::StrCat(info.prefix_, external_upstream_rq_, response_code)) | ||||
.inc(); | ||||
} | ||||
|
||||
// Handle request virtual cluster. | ||||
if (!info.request_vcluster_name_.empty()) { | ||||
info.global_scope_ | ||||
.counter(fmt::format("vhost.{}.vcluster.{}.upstream_rq_completed", info.request_vhost_name_, | ||||
info.request_vcluster_name_)) | ||||
.counter(join({vhost_, info.request_vhost_name_, vcluster_, info.request_vcluster_name_, | ||||
upstream_rq_completed_})) | ||||
.inc(); | ||||
info.global_scope_ | ||||
.counter(fmt::format("vhost.{}.vcluster.{}.upstream_rq_{}", info.request_vhost_name_, | ||||
info.request_vcluster_name_, group_string)) | ||||
.counter(join({vhost_, info.request_vhost_name_, vcluster_, info.request_vcluster_name_, | ||||
absl::StrCat(upstream_rq_, group_string)})) | ||||
.inc(); | ||||
info.global_scope_ | ||||
.counter(fmt::format("vhost.{}.vcluster.{}.upstream_rq_{}", info.request_vhost_name_, | ||||
info.request_vcluster_name_, response_code)) | ||||
.counter(join({vhost_, info.request_vhost_name_, vcluster_, info.request_vcluster_name_, | ||||
absl::StrCat(upstream_rq_, response_code)})) | ||||
.inc(); | ||||
} | ||||
|
||||
// Handle per zone stats. | ||||
if (!info.from_zone_.empty() && !info.to_zone_.empty()) { | ||||
absl::string_view prefix_without_trailing_dot = stripTrailingDot(info.prefix_); | ||||
info.cluster_scope_ | ||||
.counter(fmt::format("{}zone.{}.{}.upstream_rq_completed", info.prefix_, info.from_zone_, | ||||
info.to_zone_)) | ||||
.counter(join({prefix_without_trailing_dot, zone_, info.from_zone_, info.to_zone_, | ||||
upstream_rq_completed_})) | ||||
.inc(); | ||||
info.cluster_scope_ | ||||
.counter(fmt::format("{}zone.{}.{}.upstream_rq_{}", info.prefix_, info.from_zone_, | ||||
info.to_zone_, group_string)) | ||||
.counter(join({prefix_without_trailing_dot, zone_, info.from_zone_, info.to_zone_, | ||||
absl::StrCat(upstream_rq_, group_string)})) | ||||
.inc(); | ||||
info.cluster_scope_ | ||||
.counter(fmt::format("{}zone.{}.{}.upstream_rq_{}", info.prefix_, info.from_zone_, | ||||
info.to_zone_, response_code)) | ||||
.counter(join({prefix_without_trailing_dot, zone_, info.from_zone_, info.to_zone_, | ||||
absl::StrCat(upstream_rq_, response_code)})) | ||||
.inc(); | ||||
} | ||||
} | ||||
|
||||
void CodeUtility::chargeResponseTiming(const ResponseTimingInfo& info) { | ||||
info.cluster_scope_.histogram(info.prefix_ + "upstream_rq_time") | ||||
void CodeStatsImpl::chargeResponseTiming(const ResponseTimingInfo& info) { | ||||
info.cluster_scope_.histogram(absl::StrCat(info.prefix_, upstream_rq_time_)) | ||||
.recordValue(info.response_time_.count()); | ||||
if (info.upstream_canary_) { | ||||
info.cluster_scope_.histogram(info.prefix_ + "canary.upstream_rq_time") | ||||
info.cluster_scope_.histogram(absl::StrCat(info.prefix_, canary_upstream_rq_time_)) | ||||
.recordValue(info.response_time_.count()); | ||||
} | ||||
|
||||
if (info.internal_request_) { | ||||
info.cluster_scope_.histogram(info.prefix_ + "internal.upstream_rq_time") | ||||
info.cluster_scope_.histogram(absl::StrCat(info.prefix_, internal_upstream_rq_time_)) | ||||
.recordValue(info.response_time_.count()); | ||||
} else { | ||||
info.cluster_scope_.histogram(info.prefix_ + "external.upstream_rq_time") | ||||
info.cluster_scope_.histogram(absl::StrCat(info.prefix_, external_upstream_rq_time_)) | ||||
.recordValue(info.response_time_.count()); | ||||
} | ||||
|
||||
if (!info.request_vcluster_name_.empty()) { | ||||
info.global_scope_ | ||||
.histogram("vhost." + info.request_vhost_name_ + ".vcluster." + | ||||
info.request_vcluster_name_ + ".upstream_rq_time") | ||||
.histogram(join({vhost_, info.request_vhost_name_, vcluster_, info.request_vcluster_name_, | ||||
upstream_rq_time_})) | ||||
.recordValue(info.response_time_.count()); | ||||
} | ||||
|
||||
// Handle per zone stats. | ||||
if (!info.from_zone_.empty() && !info.to_zone_.empty()) { | ||||
info.cluster_scope_ | ||||
.histogram(fmt::format("{}zone.{}.{}.upstream_rq_time", info.prefix_, info.from_zone_, | ||||
info.to_zone_)) | ||||
.histogram(join({stripTrailingDot(info.prefix_), zone_, info.from_zone_, info.to_zone_, | ||||
upstream_rq_time_})) | ||||
.recordValue(info.response_time_.count()); | ||||
} | ||||
} | ||||
|
||||
absl::string_view CodeStatsImpl::stripTrailingDot(absl::string_view str) { | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. WDYT about moving this to StringUtil with trailing char as argument? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I thought of that...but I'm not sure if this would be useful anywhere else, so I thought maybe it doesn't make that much sense to broaden the interface everyone has to look through. If you think there might be another use-case I can broaden it. And actually the only reason to have this function to to avoid removing all the cases of "prefix." throughout the code in this PR. I think instead I'll put a TODO to remove the function and the need for it. |
||||
if (absl::EndsWith(str, ".")) { | ||||
str.remove_suffix(1); | ||||
} | ||||
return str; | ||||
} | ||||
|
||||
std::string CodeStatsImpl::join(const std::vector<absl::string_view>& v) { | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you probably use this function here? envoy/source/common/common/utility.h Line 309 in 9ba0a43
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No...this one is special; it skips the first token if it's empty. I could name it something different but I couldn't think of anything really brief. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok. Got it. |
||||
ASSERT(!v.empty()); | ||||
auto iter = v.begin(); | ||||
if (iter->empty()) { | ||||
++iter; // Skip any initial empty prefix. | ||||
} | ||||
return absl::StrJoin(iter, v.end(), "."); | ||||
} | ||||
|
||||
std::string CodeUtility::groupStringForResponseCode(Code response_code) { | ||||
if (CodeUtility::is2xx(enumToInt(response_code))) { | ||||
return "2xx"; | ||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra newline