http ratelimit: option to reduce budget on stream done #37548
Conversation
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/extensions/filters/http/ratelimit/v3/rate_limit.proto
I guess the impl can be a bit large, so I might do that in separate PRs - anyway, I'll think about it after the API gets approved.
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Wow, we have a similar requirement internally and I finally figured out a similar way. I am super surprised and happy to see this.
Cool, glad to hear that you came to a similar idea!
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
…both for clarity Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
…nd for future extension Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
@wbpcode thank you for the valuable feedback offline! I think I will go ahead and try implementing the idea - I don't think the change will be that huge.
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Thanks for this contribution. This extends the usage of the rate limit filter significantly. I've added some comments.
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Thanks for the update. It's much better now. I've added some more comments. Also, please cache the route and the route-specific config in the filter at the beginning of the request. Otherwise, a route refresh may result in the encoding phase seeing a different configuration than the decoding phase.
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Meanwhile, I am working on the integration tests now ... some cases are failing.
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
LGTM. Thanks for the contribution.
cc @mattklein123 cc @tyxia for any additional comments.
Commit Message: ratelimit: option to execute action on stream done
Additional Description:
This adds a new option apply_on_stream_done to the rate limit policy corresponding to each descriptor. This allows configuring descriptors to be evaluated in a response-content-aware way without enforcing the rate limit, in other words "fire-and-forget". Since the addend can currently be controlled via metadata per descriptor, another filter can be used to set that value to reflect its intent, for example a Lua or Ext Proc filter (see the sketches below).
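For illustration, a route-level configuration using the new option might look like the following. This is only a sketch: apply_on_stream_done is the field this PR adds, while the virtual host, cluster, and header names are invented for the example, and the exact placement of the field should be checked against the final proto.

```yaml
route_config:
  virtual_hosts:
  - name: llm_api            # hypothetical virtual host
    domains: ["*"]
    routes:
    - match: { prefix: "/v1/chat/completions" }
      route:
        cluster: llm_backend # hypothetical upstream
        rate_limits:
        - actions:
          - request_headers:
              header_name: x-user-id
              descriptor_key: user_id
          # New in this PR: evaluate this descriptor when the stream
          # completes ("fire-and-forget") instead of enforcing the
          # limit on the request path.
          apply_on_stream_done: true
```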
This use case arises from LLM API services, which usually return usage statistics in the response body. More specifically, they offer "streaming" APIs whose response is a line-by-line event stream, where the very last line contains the usage statistics. The lazy nature of this action is perfectly fine in these use cases, since the rate limit effectively works as "you are blocked starting from the next request".
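As a sketch of the response-content-aware part, a Lua filter could read the usage statistics from the response and publish the addend via dynamic metadata. Beyond the documented Lua filter API, the details below are assumptions: the metadata namespace/key and the parse_usage helper are hypothetical placeholders, not a contract defined by this PR.

```yaml
http_filters:
- name: envoy.filters.http.lua
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
    default_source_code:
      inline_string: |
        function envoy_on_response(response_handle)
          -- Buffer the whole response; the usage statistics live in
          -- the last line of the event stream.
          local body = response_handle:body()
          local text = body:getBytes(0, body:length())
          -- parse_usage is a hypothetical helper that would extract
          -- e.g. total_tokens from the final event-stream line.
          local used = parse_usage(text)
          -- Hypothetical namespace/key; the rate limit filter would be
          -- configured to read the addend from this metadata.
          response_handle:streamInfo():dynamicMetadata():set(
            "envoy.filters.http.ratelimit", "hits_addend", used)
        end
```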
Besides the LLM-specific scenario, I've also encountered this use case in data center resource allocation, where operators want to "block further computation from the next time, since you used this much resource in this request".
Ref: envoyproxy/gateway#4756
Risk Level: low
Testing: done
Docs Changes: done
Release Notes: TODO
Platform Specific Features: n/a