-
Notifications
You must be signed in to change notification settings - Fork 7k
use correct route value in proxy metrics tags instead of route_prefix #58180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com>
akyang-anyscale
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
potential optimization is to use the matched pattern in the replica metrics as well to not duplicate work
Excellent idea. Doing it now |
discussed offline, idea is to pass the matched route from proxy to replica through request metatdata. Taking this up in a follow up PR |
ok-scale
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using starlette matcher was much better! LGTM
…ray-project#58180) ## Expose Route Patterns in Proxy Metrics fixes ray-project#52212 ### Problem Proxy metrics (`ray_serve_num_http_requests_total`, `ray_serve_http_request_latency_ms`) only expose `route_prefix` (e.g., `/api`) instead of actual route patterns (e.g., `/api/users/{user_id}`). This prevents granular monitoring of individual endpoints without causing high cardinality from unique request paths. ### Design **Route Pattern Extraction & Propagation:** - Replicas extract route patterns from ASGI apps (FastAPI/Starlette) at initialization using `extract_route_patterns()` - Patterns propagate: Replica → `ReplicaMetadata` → `DeploymentState` → `EndpointInfo` → Proxy - Works with both normal patterns (routes in class) and factory patterns (callable returns app) **Proxy Route Matching:** - `ProxyRouter.match_route_pattern()` matches incoming requests to specific patterns using cached mock Starlette apps - Metrics tag requests with parameterized routes (e.g., `/api/users/{user_id}`) instead of prefixes - Fallback to `route_prefix` if patterns unavailable or matching fails **Performance:** Metric | Before | After -- | -- | -- Requests per second (RPS) | 403.39 | 397.82 Mean latency (ms) | 247.9 | 251.37 p50 (ms) | 224 | 223 p90 (ms) | 415 | 428 p99 (ms) | 526 | 544 ### Testing - Unit tests for `extract_route_patterns()` - Integration test verifying metrics use patterns and avoid high cardinality - Parametrized for both normal and factory patterns --------- Signed-off-by: abrar <abrar@anyscale.com>
…#58180) ## Expose Route Patterns in Proxy Metrics fixes #52212 ### Problem Proxy metrics (`ray_serve_num_http_requests_total`, `ray_serve_http_request_latency_ms`) only expose `route_prefix` (e.g., `/api`) instead of actual route patterns (e.g., `/api/users/{user_id}`). This prevents granular monitoring of individual endpoints without causing high cardinality from unique request paths. ### Design **Route Pattern Extraction & Propagation:** - Replicas extract route patterns from ASGI apps (FastAPI/Starlette) at initialization using `extract_route_patterns()` - Patterns propagate: Replica → `ReplicaMetadata` → `DeploymentState` → `EndpointInfo` → Proxy - Works with both normal patterns (routes in class) and factory patterns (callable returns app) **Proxy Route Matching:** - `ProxyRouter.match_route_pattern()` matches incoming requests to specific patterns using cached mock Starlette apps - Metrics tag requests with parameterized routes (e.g., `/api/users/{user_id}`) instead of prefixes - Fallback to `route_prefix` if patterns unavailable or matching fails **Performance:** Metric | Before | After -- | -- | -- Requests per second (RPS) | 403.39 | 397.82 Mean latency (ms) | 247.9 | 251.37 p50 (ms) | 224 | 223 p90 (ms) | 415 | 428 p99 (ms) | 526 | 544 ### Testing - Unit tests for `extract_route_patterns()` - Integration test verifying metrics use patterns and avoid high cardinality - Parametrized for both normal and factory patterns --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
…ray-project#58180) ## Expose Route Patterns in Proxy Metrics fixes ray-project#52212 ### Problem Proxy metrics (`ray_serve_num_http_requests_total`, `ray_serve_http_request_latency_ms`) only expose `route_prefix` (e.g., `/api`) instead of actual route patterns (e.g., `/api/users/{user_id}`). This prevents granular monitoring of individual endpoints without causing high cardinality from unique request paths. ### Design **Route Pattern Extraction & Propagation:** - Replicas extract route patterns from ASGI apps (FastAPI/Starlette) at initialization using `extract_route_patterns()` - Patterns propagate: Replica → `ReplicaMetadata` → `DeploymentState` → `EndpointInfo` → Proxy - Works with both normal patterns (routes in class) and factory patterns (callable returns app) **Proxy Route Matching:** - `ProxyRouter.match_route_pattern()` matches incoming requests to specific patterns using cached mock Starlette apps - Metrics tag requests with parameterized routes (e.g., `/api/users/{user_id}`) instead of prefixes - Fallback to `route_prefix` if patterns unavailable or matching fails **Performance:** Metric | Before | After -- | -- | -- Requests per second (RPS) | 403.39 | 397.82 Mean latency (ms) | 247.9 | 251.37 p50 (ms) | 224 | 223 p90 (ms) | 415 | 428 p99 (ms) | 526 | 544 ### Testing - Unit tests for `extract_route_patterns()` - Integration test verifying metrics use patterns and avoid high cardinality - Parametrized for both normal and factory patterns --------- Signed-off-by: abrar <abrar@anyscale.com>
…ray-project#58180) ## Expose Route Patterns in Proxy Metrics fixes ray-project#52212 ### Problem Proxy metrics (`ray_serve_num_http_requests_total`, `ray_serve_http_request_latency_ms`) only expose `route_prefix` (e.g., `/api`) instead of actual route patterns (e.g., `/api/users/{user_id}`). This prevents granular monitoring of individual endpoints without causing high cardinality from unique request paths. ### Design **Route Pattern Extraction & Propagation:** - Replicas extract route patterns from ASGI apps (FastAPI/Starlette) at initialization using `extract_route_patterns()` - Patterns propagate: Replica → `ReplicaMetadata` → `DeploymentState` → `EndpointInfo` → Proxy - Works with both normal patterns (routes in class) and factory patterns (callable returns app) **Proxy Route Matching:** - `ProxyRouter.match_route_pattern()` matches incoming requests to specific patterns using cached mock Starlette apps - Metrics tag requests with parameterized routes (e.g., `/api/users/{user_id}`) instead of prefixes - Fallback to `route_prefix` if patterns unavailable or matching fails **Performance:** Metric | Before | After -- | -- | -- Requests per second (RPS) | 403.39 | 397.82 Mean latency (ms) | 247.9 | 251.37 p50 (ms) | 224 | 223 p90 (ms) | 415 | 428 p99 (ms) | 526 | 544 ### Testing - Unit tests for `extract_route_patterns()` - Integration test verifying metrics use patterns and avoid high cardinality - Parametrized for both normal and factory patterns --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
…te matching (#58927) The correct route value is already part of RequestMetadata after #58180, no need to recompute it again. no observed perf diff in microbenchmark After ``` Type Name # Requests # Fails Median (ms) 95%ile (ms) 99%ile (ms) Average (ms) Min (ms) Max (ms) Average size (bytes) Current RPS Current Failures/s GET /echo?message=hello 28068 0 200 410 470 228.27 80 592 26 430.3 0 Aggregated 28068 0 200 410 470 228.27 80 592 26 430.3 0 ``` Before ``` Type Name # Requests # Fails Median (ms) 95%ile (ms) 99%ile (ms) Average (ms) Min (ms) Max (ms) Average size (bytes) Current RPS Current Failures/s GET /echo?message=hello 27427 0 210 410 470 232.12 76 604 26 429.7 0 Aggregated 27427 0 210 410 470 232.12 76 604 26 429.7 0 ``` Additionally, old implementation wrongly assumed that there will only be one method (GET,PUT) corresponding to a route. This PR fixes that assumption and tests for it. --------- Signed-off-by: abrar <abrar@anyscale.com>
…te matching (ray-project#58927) The correct route value is already part of RequestMetadata after ray-project#58180, no need to recompute it again. no observed perf diff in microbenchmark After ``` Type Name # Requests # Fails Median (ms) 95%ile (ms) 99%ile (ms) Average (ms) Min (ms) Max (ms) Average size (bytes) Current RPS Current Failures/s GET /echo?message=hello 28068 0 200 410 470 228.27 80 592 26 430.3 0 Aggregated 28068 0 200 410 470 228.27 80 592 26 430.3 0 ``` Before ``` Type Name # Requests # Fails Median (ms) 95%ile (ms) 99%ile (ms) Average (ms) Min (ms) Max (ms) Average size (bytes) Current RPS Current Failures/s GET /echo?message=hello 27427 0 210 410 470 232.12 76 604 26 429.7 0 Aggregated 27427 0 210 410 470 232.12 76 604 26 429.7 0 ``` Additionally, old implementation wrongly assumed that there will only be one method (GET,PUT) corresponding to a route. This PR fixes that assumption and tests for it. --------- Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: YK <1811651+ykdojo@users.noreply.github.com>
…te matching (ray-project#58927) The correct route value is already part of RequestMetadata after ray-project#58180, no need to recompute it again. no observed perf diff in microbenchmark After ``` Type Name # Requests # Fails Median (ms) 95%ile (ms) 99%ile (ms) Average (ms) Min (ms) Max (ms) Average size (bytes) Current RPS Current Failures/s GET /echo?message=hello 28068 0 200 410 470 228.27 80 592 26 430.3 0 Aggregated 28068 0 200 410 470 228.27 80 592 26 430.3 0 ``` Before ``` Type Name # Requests # Fails Median (ms) 95%ile (ms) 99%ile (ms) Average (ms) Min (ms) Max (ms) Average size (bytes) Current RPS Current Failures/s GET /echo?message=hello 27427 0 210 410 470 232.12 76 604 26 429.7 0 Aggregated 27427 0 210 410 470 232.12 76 604 26 429.7 0 ``` Additionally, old implementation wrongly assumed that there will only be one method (GET,PUT) corresponding to a route. This PR fixes that assumption and tests for it. --------- Signed-off-by: abrar <abrar@anyscale.com>
Expose Route Patterns in Proxy Metrics
fixes #52212
Problem
Proxy metrics (
ray_serve_num_http_requests_total,ray_serve_http_request_latency_ms) only exposeroute_prefix(e.g.,/api) instead of actual route patterns (e.g.,/api/users/{user_id}). This prevents granular monitoring of individual endpoints without causing high cardinality from unique request paths.Design
Route Pattern Extraction & Propagation:
extract_route_patterns()ReplicaMetadata→DeploymentState→EndpointInfo→ ProxyProxy Route Matching:
ProxyRouter.match_route_pattern()matches incoming requests to specific patterns using cached mock Starlette apps/api/users/{user_id}) instead of prefixesroute_prefixif patterns unavailable or matching failsPerformance:
Testing
extract_route_patterns()