Add a new stat metric in queue to prevent double counting #3477

hohaichi · 2019-03-21T00:30:51Z

Currently, a request pending in both the activator and the queue proxy is counted twice: once in the activator and once in the queue proxy. This change fixes that by having the queue proxy report a concurrency metric for proxied requests and having the autoscaler discount that concurrency when it calculates the total concurrency for scaling decisions.
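The arithmetic behind the fix can be sketched as follows. This is only an illustration under assumed names: `totalConcurrency` and its parameters are hypothetical and are not taken from the PR's code.

```go
package main

import "fmt"

// totalConcurrency is a hypothetical helper that illustrates the discount.
//   activator: average concurrency reported by the activator
//   queue:     average concurrency reported by the queue proxy
//              (this includes requests that arrived via the activator)
//   proxied:   average concurrency of requests at the queue proxy that were
//              proxied through the activator
func totalConcurrency(activator, queue, proxied float64) float64 {
	// Requests proxied by the activator appear in both reports,
	// so they are subtracted once.
	return activator + queue - proxied
}

func main() {
	// Example: 2 requests are in flight through the activator and 1 request
	// reached the queue proxy directly, so the queue proxy sees 3 in total.
	naive := 2.0 + 3.0
	fixed := totalConcurrency(2, 3, 2)
	fmt.Printf("naive sum: %.1f, with discount: %.1f\n", naive, fixed)
}
```

In this example the naive sum counts the two activator-proxied requests twice, while the discounted value matches the three requests that are actually in flight.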

Fixes #3301

Proposed Changes

Add a new Knative header for tracking where a request has gone through, and let the activator use it to mark proxied requests.
Let the queue proxy report a new metric for the average concurrency of requests proxied through the activator.
Let the autoscaler discount the average proxied concurrency from the average concurrency reported by the queue proxy when it calculates the total concurrency for scaling decisions.
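The first two steps above could look roughly like the sketch below. It assumes a placeholder header name (`K-Example-Proxy-Request`) and hypothetical types (`counters`, `wrap`, `observe`); the real header name and the queue proxy's actual stats code are not shown in this thread.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// proxyHeader is a placeholder name chosen for this example only.
const proxyHeader = "K-Example-Proxy-Request"

// counters tracks in-flight requests at a (hypothetical) queue proxy.
type counters struct {
	total   int64 // all in-flight requests
	proxied int64 // in-flight requests marked as proxied by the activator
}

// wrap counts each request while it is in flight, and additionally counts it
// as proxied when the marker header is present.
func (c *counters) wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt64(&c.total, 1)
		defer atomic.AddInt64(&c.total, -1)
		if r.Header.Get(proxyHeader) != "" {
			atomic.AddInt64(&c.proxied, 1)
			defer atomic.AddInt64(&c.proxied, -1)
		}
		next.ServeHTTP(w, r)
	})
}

// observe sends one request through the wrapper and returns the counter
// values seen while that request was in flight.
func observe(viaActivator bool) (total, proxied int64) {
	c := &counters{}
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		total = atomic.LoadInt64(&c.total)
		proxied = atomic.LoadInt64(&c.proxied)
	})
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if viaActivator {
		// The activator would set the marker before forwarding the request.
		req.Header.Set(proxyHeader, "activator")
	}
	c.wrap(inner).ServeHTTP(httptest.NewRecorder(), req)
	return total, proxied
}

func main() {
	total, proxied := observe(true)
	fmt.Println("via activator:", total, proxied)
	total, proxied = observe(false)
	fmt.Println("direct:", total, proxied)
}
```

The sketch reports instantaneous counts for brevity; the PR describes an average concurrency metric, which would require sampling these counters over time.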

Release Note

knative-prow-robot

@hohaichi: 0 warnings.


vagababov

In general looks fine.
But it still has the same attack vector that the underlying issue talks about.
Perhaps the header should be dynamic?
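One possible reading of this concern, which the thread does not spell out, is that a client bypassing the activator could set the marker header itself and cause real load to be discounted. The sketch below illustrates that reading with hypothetical names (`request`, `scalingConcurrency`); it is not code from the PR.

```go
package main

import "fmt"

// request is a hypothetical record of one in-flight request at the queue proxy.
type request struct {
	viaActivator bool // the request really passed through the activator
	hasHeader    bool // the request carries the proxy marker header
}

// scalingConcurrency mimics the discounted sum: activator count plus queue
// count minus the count of requests that carry the marker header.
func scalingConcurrency(reqs []request) int {
	activator, queue, proxied := 0, 0, 0
	for _, r := range reqs {
		queue++
		if r.viaActivator {
			activator++
		}
		if r.hasHeader {
			proxied++
		}
	}
	return activator + queue - proxied
}

func main() {
	honest := []request{{}, {}, {}}
	spoofed := []request{{hasHeader: true}, {hasHeader: true}, {hasHeader: true}}
	fmt.Println("honest direct traffic:", scalingConcurrency(honest))
	fmt.Println("spoofed header:", scalingConcurrency(spoofed))
}
```

Under these assumptions, three direct requests that forge the header are discounted entirely, which is why a static, guessable header value might be a weakness.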

cmd/queue/main.go

pkg/network/network.go

markusthoemmes

Very neat execution, thanks! Left a few comments throughout.

cmd/queue/main.go

pkg/queue/stats.go

hohaichi · 2019-03-21T21:20:34Z

> In general looks fine.
> But it still has the same attack vector that the underlying issue talks about.
> Perhaps the header should be dynamic?

I think an attacker can attack the same way using the 'k-network-probe' header. It's probably not a real threat, or we should have the same solution for both.

hohaichi · 2019-03-25T20:04:28Z

Updated with tests and rebased.

hohaichi · 2019-03-25T20:07:18Z

/assign @mdemirhan
/assign @markusthoemmes

cmd/queue/main.go

cmd/queue/main_test.go

pkg/activator/handler/handler_test.go

vagababov

A few nits.
Rest is fine with me.

cmd/queue/main_test.go

pkg/activator/handler/handler_test.go

markusthoemmes

Generally LGTM, but I left a few comments on naming and on adding comments that describe the values.

pkg/autoscaler/autoscaler.go

pkg/queue/stats_reporter.go

pkg/queue/stats.go

hohaichi · 2019-03-26T16:38:17Z

/assign @yanweiguo

pkg/autoscaler/autoscaler_test.go

pkg/autoscaler/autoscaler.go

pkg/queue/stats.go

knative-metrics-robot · 2019-03-26T17:38:58Z

The following is the coverage report on pkg/.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

File	Old Coverage	New Coverage	Delta
pkg/autoscaler/autoscaler.go	97.2%	97.0%	-0.2
pkg/autoscaler/stats_scraper.go	83.7%	90.3%	6.6
pkg/queue/stats_reporter.go	87.8%	87.5%	-0.3

yanweiguo · 2019-03-26T17:49:35Z

/lgtm

markusthoemmes

/approve
/lgtm

Thanks for all the adjustments 🎉

knative-prow-robot · 2019-03-26T17:52:31Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hohaichi, markusthoemmes

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~cmd/queue/OWNERS~~ [markusthoemmes]
~~pkg/activator/OWNERS~~ [markusthoemmes]
~~pkg/autoscaler/OWNERS~~ [markusthoemmes]
~~pkg/network/OWNERS~~ [markusthoemmes]
~~pkg/queue/OWNERS~~ [markusthoemmes]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

hohaichi · 2019-03-26T19:20:10Z

/retest

knative-prow-robot added the do-not-merge/work-in-progress, size/L, area/autoscale, and area/networking labels Mar 21, 2019

knative-prow-robot reviewed Mar 21, 2019

knative-prow-robot requested review from josephburnett, lichuqiang, markusthoemmes, mdemirhan and tcnghia March 21, 2019 00:31

vagababov reviewed Mar 21, 2019

cmd/queue/main.go

pkg/network/network.go

markusthoemmes reviewed Mar 21, 2019

cmd/queue/main.go (2 review threads)

pkg/queue/stats.go (3 review threads)

hohaichi force-pushed the i3301 branch from f996ff0 to 0311790 March 25, 2019 19:58

knative-prow-robot added the size/XL label and removed the size/L label Mar 25, 2019

mattmoor-sockpuppet reviewed Mar 25, 2019

hohaichi force-pushed the i3301 branch from b96d364 to d0282d6 March 25, 2019 20:02

hohaichi changed the title ~~[WIP] Add a new stat metric in queue to prevent double counting~~ Add a new stat metric in queue to prevent double counting Mar 25, 2019

knative-prow-robot removed the do-not-merge/work-in-progress label Mar 25, 2019

knative-prow-robot assigned markusthoemmes and mdemirhan Mar 25, 2019

Update test to stay compatible with a recent change.

355d2e5

vagababov reviewed Mar 25, 2019

cmd/queue/main.go

cmd/queue/main_test.go (2 review threads)

pkg/activator/handler/handler_test.go

Incorporate feedback.

eb0a420

vagababov reviewed Mar 25, 2019

cmd/queue/main_test.go

pkg/activator/handler/handler_test.go

markusthoemmes reviewed Mar 26, 2019

pkg/autoscaler/autoscaler.go

pkg/queue/stats_reporter.go

pkg/queue/stats.go

Minor changes to address feedback on naming, logging, and documentation.

136f133

knative-prow-robot assigned yanweiguo Mar 26, 2019

yanweiguo reviewed Mar 26, 2019

pkg/autoscaler/autoscaler_test.go

yanweiguo reviewed Mar 26, 2019

pkg/autoscaler/autoscaler.go

pkg/queue/stats.go

Improved readability with comments and a little refactoring.

5937879

knative-prow-robot added the lgtm label Mar 26, 2019

markusthoemmes approved these changes Mar 26, 2019

knative-prow-robot added the approved label Mar 26, 2019

mattmoor added this to the Serving 0.5 milestone Mar 26, 2019

knative-prow-robot merged commit 969afd8 into knative:master Mar 26, 2019

hohaichi deleted the i3301 branch March 26, 2019 20:50
