Add a new stat metric in queue to prevent double counting #3477

Merged
knative-prow-robot merged 5 commits into knative:master on Mar 26, 2019

hohaichi
Contributor

Currently, a request pending in both the activator and the queue proxy is counted twice: once in the activator and once in the queue proxy. This change fixes that by having the queue proxy report a concurrency metric for proxied requests and having the autoscaler discount that concurrency when it calculates the total concurrency for scaling decisions.
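
To make the double counting concrete, here is a minimal Go sketch of the arithmetic. It is an illustration under stated assumptions, not the code in this PR: the `stat` type, its field names, and `totalConcurrency` are hypothetical, and the real autoscaler aggregates averages over a time window rather than a single snapshot.

```go
package main

// stat is a hypothetical per-pod report from the queue proxy.
// Field names are assumptions made for this sketch.
type stat struct {
	averageConcurrency        float64 // all requests seen by the queue proxy
	averageProxiedConcurrency float64 // the subset that arrived via the activator
}

// totalConcurrency adds the concurrency reported by the activator to the
// concurrency reported by each queue proxy, subtracting the proxied share so
// that a request held by both components is counted only once.
func totalConcurrency(activator float64, pods []stat) float64 {
	total := activator
	for _, p := range pods {
		total += p.averageConcurrency - p.averageProxiedConcurrency
	}
	return total
}
```

For example, suppose the activator holds 4 requests, 3 of which it has already forwarded to a pod that also serves 2 direct requests. The pod reports a concurrency of 5 with 3 proxied. A naive sum gives 4 + 5 = 9, while the discounted sum gives 4 + (5 - 3) = 6, which matches the 6 distinct requests in flight.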

Fixes #3301

Proposed Changes

  • Add a new Knative header for tracking which components a request passes through, and have the activator use it to mark proxied requests.
  • Have the queue proxy report a new metric for the average concurrency of requests proxied through the activator.
  • Have the autoscaler discount the average proxied concurrency from the average concurrency reported by the queue proxy when it calculates the total concurrency for scaling decisions.

Release Note


@knative-prow-robot knative-prow-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/autoscale area/networking labels Mar 21, 2019
Contributor

@knative-prow-robot knative-prow-robot left a comment

@hohaichi: 0 warnings.

Contributor

@vagababov vagababov left a comment

In general looks fine.
But it still has the same attack vector that the underlying issue talks about.
Perhaps the header should be dynamic?

Contributor

@markusthoemmes markusthoemmes left a comment

Very neat execution, thanks! Left a few comments throughout.

@hohaichi
Contributor Author

> In general looks fine.
> But it still has the same attack vector that the underlying issue talks about.
> Perhaps the header should be dynamic?

I think an attacker could attack the same way using the 'k-network-probe' header. Either it's not a real threat, or we should apply the same solution to both.

@knative-prow-robot knative-prow-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 25, 2019
@hohaichi hohaichi changed the title [WIP] Add a new stat metric in queue to prevent double counting Add a new stat metric in queue to prevent double counting Mar 25, 2019
@knative-prow-robot knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 25, 2019
@hohaichi
Contributor Author

Updated with tests and rebased.

@hohaichi
Contributor Author

/assign @mdemirhan
/assign @markusthoemmes

Contributor

@vagababov vagababov left a comment

A few nits.
Rest is fine with me.

Contributor

@markusthoemmes markusthoemmes left a comment

Generally LGTM but a few comments on naming and adding comments describing the values.

@hohaichi
Contributor Author

/assign @yanweiguo

@knative-metrics-robot

The following is the coverage report on pkg/.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

File                             Old Coverage  New Coverage  Delta
pkg/autoscaler/autoscaler.go     97.2%         97.0%         -0.2
pkg/autoscaler/stats_scraper.go  83.7%         90.3%         6.6
pkg/queue/stats_reporter.go      87.8%         87.5%         -0.3

@yanweiguo
Contributor

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 26, 2019
Contributor

@markusthoemmes markusthoemmes left a comment

/approve
/lgtm

Thanks for all the adjustments 🎉

@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hohaichi, markusthoemmes

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 26, 2019
@mattmoor mattmoor added this to the Serving 0.5 milestone Mar 26, 2019
@hohaichi
Contributor Author

/retest

@knative-prow-robot knative-prow-robot merged commit 969afd8 into knative:master Mar 26, 2019
@hohaichi hohaichi deleted the i3301 branch March 26, 2019 20:50

Successfully merging this pull request may close these issues.

Don't double account for requests going through the activator.