
Improve concurrent request bucketing in queue-proxy. #1091

Merged: 5 commits into knative:master on Jun 13, 2018

Conversation

@markusthoemmes (Contributor)

Fixes #1060

Proposed Changes

Instead of counting all requests that arrived in a certain bucket as concurrent, the queue proxy now reports the actual maximum concurrency that was observed within that bucket.

If, for example, 3 requests arrive at once, the maximum concurrency for that bucket is 3. If a fourth arrives while those 3 are still open, the maximum concurrency becomes 4. Completing a request decrements the concurrency immediately (rather than draining the outgoing request queue on quantization, which caused the over-counting described above).
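
For illustration, here is a minimal sketch of this reporting scheme, assuming a single goroutine consumes arrival/completion events and a ticker closes each bucket. The names are illustrative, not the queue-proxy's actual identifiers:

package queuestats

import "time"

// ReqEvent marks a request arriving (ReqIn) or completing (ReqOut).
type ReqEvent int

const (
    ReqIn ReqEvent = iota
    ReqOut
)

// ReportMaxConcurrency tracks in-flight requests and, on every tick,
// reports the maximum concurrency observed during that bucket.
func ReportMaxConcurrency(events <-chan ReqEvent, ticks <-chan time.Time, report chan<- int32) {
    var concurrency, max int32
    for {
        select {
        case e := <-events:
            switch e {
            case ReqIn:
                concurrency++
                if concurrency > max {
                    max = concurrency
                }
            case ReqOut:
                concurrency-- // decrement immediately on completion
            }
        case <-ticks:
            report <- max
            max = concurrency // the next bucket starts at the current in-flight count
        }
    }
}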

Release Note

Improved concurrent request bucketing of the queue-proxy to report more accurate values.

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.



@google-prow-robot google-prow-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jun 7, 2018
- reqInChan = make(chan queue.Poke, requestCountingQueueLength)
- reqOutChan = make(chan queue.Poke, requestCountingQueueLength)
+ reqInChan = make(chan queue.Poke)
+ reqOutChan = make(chan queue.Poke)
@markusthoemmes (Contributor Author)


I'm fairly new to Go, so please bear with me: my understanding is that we shouldn't buffer here anymore. In an extreme case, buffering could lead to reporting a higher concurrency than what we actually see in the container (since In and Out are different channels). Maybe it makes sense to move away from "Poke" and instead have "In" and "Out" events pushed through the same channel?

@dprotaso (Member) commented Jun 8, 2018

Without a buffer, the side effect is that HTTP requests being proxied through the queue will block waiting for something to read from this channel. Looking at the code, this would most likely happen when sending stats to the autoscaler is on a slow network.

It might be worth making queue.Poke be, or contain, a timestamp. Then, when aggregating, we'd check whether the time falls within our bucket interval and maybe drop the poke if it doesn't.

@markusthoemmes (Contributor Author)

@dprotaso the channel for sending stats is still buffered, though. My understanding is that this only makes incrementing/decrementing the concurrency counter blocking, which might be okay?

@josephburnett (Contributor)

I don't think we should allow request handling to block on stat reporting. So I think the buffered channel should remain. We could push both in and out through the same channel, although I don't think it makes much of a difference. Order is not critical here.

We can think about what happens when stat reporting gets way behind. Right now, concurrency "happens" when the stat reporter sees it. The problem with making the Poke a Timestamp is that we have to keep in and out balanced. Otherwise concurrency won't return to zero. E.g. if we disregard one "in" because it's late, we must remember to disregard one "out". Which one?
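
To make that balancing concern concrete, here is a hypothetical sketch (illustrative names, not actual queue-proxy code) of the timestamped-Poke idea and its failure mode: a late "in" gets dropped, but its matching "out" still decrements the counter, so the count drifts and never returns to zero.

package queuestats

import "time"

// Poke is a hypothetical timestamped variant of queue.Poke;
// In marks a request arrival, otherwise a completion.
type Poke struct {
    In   bool
    Time time.Time
}

// countDroppingLate drops pokes that fall before the bucket start.
// Dropping a late "in" while still processing its matching "out"
// unbalances the counter, which can end up below zero.
func countDroppingLate(pokes <-chan Poke, bucketStart time.Time) int32 {
    var concurrency int32
    for p := range pokes {
        if p.Time.Before(bucketStart) {
            continue // a late "in" is dropped here...
        }
        if p.In {
            concurrency++
        } else {
            concurrency-- // ...but its matching "out" still decrements
        }
    }
    return concurrency
}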

@markusthoemmes (Contributor Author)

/assign @josephburnett

@rootfs (Contributor) commented Jun 7, 2018

@markusthoemmes needs to sign a CLA.

@markusthoemmes (Contributor Author)

I signed the CLA!

@googlebot

CLAs look good, thanks!

@google-prow-robot google-prow-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 8, 2018
@josephburnett (Contributor) left a comment

This looks good. My only concern is removing the channel buffering.

@markusthoemmes (Contributor Author)

/retest

- // Ticks with every request completed
- ReqOutChan chan Poke
+ // Ticks with every request arrived/completed respectively
+ ReqChan chan interface{}
@josephburnett (Contributor) commented Jun 13, 2018

Nit: chan interface{} seems too open to me. You can throw anything on the channel and the compiler won't tell you there's a problem, but the binary will crash when the switch statement has no match.

How about an enumerated type instead?

type StatEvent int

const (
    ReqIn StatEvent = iota
    ReqOut
)

ReqChan chan StatEvent

switch event {
case ReqIn:
case ReqOut:
}
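
For context, a hypothetical wrapper that would feed such typed events from the request path, assuming the enumerated type above and a sufficiently buffered channel so the sends don't block request handling. This is a sketch, not the PR's actual code:

package queuestats

import "net/http"

// CountingHandler reports request arrival and completion on reqChan
// while delegating to the next handler in the chain.
func CountingHandler(reqChan chan<- StatEvent, next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        reqChan <- ReqIn                       // request arrived
        defer func() { reqChan <- ReqOut }()   // request completed
        next.ServeHTTP(w, r)
    })
}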

@josephburnett (Contributor)

/lgtm
/approve

@google-prow-robot google-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 13, 2018
@mattmoor (Member) left a comment

/approve

for ./config/

@google-prow-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: josephburnett, markusthoemmes, mattmoor

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-prow-robot google-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 13, 2018
@google-prow-robot google-prow-robot merged commit baf4b24 into knative:master Jun 13, 2018
skonto added a commit to skonto/serving that referenced this pull request Apr 12, 2022