Conversation

@abrarsheikh
Contributor

@abrarsheikh abrarsheikh commented Oct 26, 2025

Expose Route Patterns in Proxy Metrics

fixes #52212

Problem

Proxy metrics (ray_serve_num_http_requests_total, ray_serve_http_request_latency_ms) only expose route_prefix (e.g., /api) instead of actual route patterns (e.g., /api/users/{user_id}). As a result, individual endpoints cannot be monitored separately, and tagging metrics with raw request paths instead would cause high cardinality.

Design

Route Pattern Extraction & Propagation:

  • Replicas extract route patterns from ASGI apps (FastAPI/Starlette) at initialization using extract_route_patterns()
  • Patterns propagate: Replica → ReplicaMetadata → DeploymentState → EndpointInfo → Proxy
  • Works with both normal patterns (routes in class) and factory patterns (callable returns app)
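As a rough illustration of the extraction step, the sketch below collects path patterns from any object that exposes a `routes` list whose entries carry a `path` attribute (as Starlette route objects do). `FakeRoute`, `FakeApp`, and `sketch_extract_patterns` are hypothetical stand-ins for illustration; this is not the PR's `extract_route_patterns()`, whose signature is not shown in this conversation.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class FakeRoute:
    """Hypothetical stand-in for a framework route object."""
    path: str
    methods: frozenset = frozenset({"GET"})


@dataclass
class FakeApp:
    """Hypothetical stand-in for an ASGI app that exposes `routes`."""
    routes: List[FakeRoute] = field(default_factory=list)


def sketch_extract_patterns(app: Any) -> List[str]:
    """Return the sorted, de-duplicated path patterns found on `app`.

    Objects without a `routes` attribute yield an empty list, which a
    caller could treat as "patterns unavailable".
    """
    patterns = set()
    for route in getattr(app, "routes", None) or []:
        path = getattr(route, "path", None)
        if isinstance(path, str):
            patterns.add(path)
    return sorted(patterns)


app = FakeApp(routes=[
    FakeRoute("/api/users/{user_id}"),
    FakeRoute("/api/users/{user_id}", frozenset({"PUT"})),
    FakeRoute("/api/items"),
])
print(sketch_extract_patterns(app))
```

Two routes that share a path but differ in method collapse into a single pattern here, since only the pattern string matters for the metric tag.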

Proxy Route Matching:

  • ProxyRouter.match_route_pattern() matches incoming requests to specific patterns using cached mock Starlette apps
  • Metrics tag requests with parameterized routes (e.g., /api/users/{user_id}) instead of prefixes
  • Falls back to route_prefix if patterns are unavailable or matching fails
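The matching-with-fallback behavior can be sketched with the standard library alone. Note that this swaps in a simple regex translation of `{param}` placeholders in place of the Starlette matcher the PR uses, and it ignores converter semantics such as `{id:int}`. `compile_pattern` and `match_route_tag` are hypothetical names, not part of `ProxyRouter`.

```python
import re
from functools import lru_cache
from typing import Sequence

# Matches one "{name}" or "{name:converter}" placeholder.
_PARAM = re.compile(r"\{[^/{}]+\}")


@lru_cache(maxsize=None)
def compile_pattern(pattern: str) -> "re.Pattern[str]":
    """Translate a route pattern into an anchored regex (cached per pattern)."""
    literal_parts = _PARAM.split(pattern)
    # Each placeholder matches exactly one non-empty path segment.
    regex = "[^/]+".join(re.escape(part) for part in literal_parts)
    return re.compile(f"^{regex}$")


def match_route_tag(path: str, patterns: Sequence[str], route_prefix: str) -> str:
    """Return the first matching pattern, or `route_prefix` as the fallback."""
    for pattern in patterns:
        if compile_pattern(pattern).match(path):
            return pattern
    return route_prefix


patterns = ["/api/users/{user_id}", "/api/items"]

# 100 distinct request paths collapse into a single metric tag value.
tags = {match_route_tag(f"/api/users/{i}", patterns, "/api") for i in range(100)}
print(tags)

# An unknown path falls back to the route prefix.
print(match_route_tag("/api/unknown/1", patterns, "/api"))
```

Caching the compiled matcher per pattern keeps the per-request cost to the match itself, which is the same motivation as caching the mock Starlette apps.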

Performance:

Metric | Before | After
-- | -- | --
Requests per second (RPS) | 403.39 | 397.82
Mean latency (ms) | 247.9 | 251.37
p50 (ms) | 224 | 223
p90 (ms) | 415 | 428
p99 (ms) | 526 | 544
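To put the before/after figures in relative terms, a short calculation over the numbers reported above is enough. The helper name `pct_change` is only for illustration.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the benchmark figures above.
benchmarks = {
    "Requests per second (RPS)": (403.39, 397.82),
    "Mean latency (ms)": (247.9, 251.37),
    "p50 (ms)": (224, 223),
    "p90 (ms)": (415, 428),
    "p99 (ms)": (526, 544),
}

for name, (before, after) in benchmarks.items():
    print(f"{name}: {pct_change(before, after):+.2f}%")
```

By this arithmetic, throughput drops by roughly 1.4% and p99 latency rises by roughly 3.4%.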

Testing

  • Unit tests for extract_route_patterns()
  • Integration test verifying metrics use patterns and avoid high cardinality
  • Parametrized for both normal and factory patterns

Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh added the go add ONLY when ready to merge, run all tests label Oct 26, 2025
@abrarsheikh abrarsheikh marked this pull request as ready for review October 27, 2025 16:52
@abrarsheikh abrarsheikh requested a review from a team as a code owner October 27, 2025 16:52
@ray-gardener ray-gardener bot added serve Ray Serve Related Issue observability Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling labels Oct 27, 2025
Contributor

@akyang-anyscale akyang-anyscale left a comment

potential optimization is to use the matched pattern in the replica metrics as well to not duplicate work

@abrarsheikh
Contributor Author

potential optimization is to use the matched pattern in the replica metrics as well to not duplicate work

Excellent idea. Doing it now

@abrarsheikh
Contributor Author

potential optimization is to use the matched pattern in the replica metrics as well to not duplicate work

Excellent idea. Doing it now

Discussed offline: the idea is to pass the matched route from the proxy to the replica through request metadata. Taking this up in a follow-up PR.

Contributor

@ok-scale ok-scale left a comment

using starlette matcher was much better! LGTM

@abrarsheikh abrarsheikh merged commit e466c6f into master Oct 29, 2025
6 checks passed
@abrarsheikh abrarsheikh deleted the 52212-abrar-proxy_metrics branch October 29, 2025 18:39
YoussefEssDS pushed a commit to YoussefEssDS/ray that referenced this pull request Nov 8, 2025
…ray-project#58180)

Signed-off-by: abrar <abrar@anyscale.com>
elliot-barn pushed a commit that referenced this pull request Nov 14, 2025
…#58180)

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…ray-project#58180)

Signed-off-by: abrar <abrar@anyscale.com>
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
…ray-project#58180)

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
abrarsheikh added a commit that referenced this pull request Nov 25, 2025
…te matching (#58927)

The correct route value is already part of RequestMetadata after
#58180, so there is no need to recompute it.

No performance difference was observed in the microbenchmark.
After
```
Type	Name	# Requests	# Fails	Median (ms)	95%ile (ms)	99%ile (ms)	Average (ms)	Min (ms)	Max (ms)	Average size (bytes)	Current RPS	Current Failures/s
GET	/echo?message=hello	28068	0	200	410	470	228.27	80	592	26	430.3	0
Aggregated	28068	0	200	410	470	228.27	80	592	26	430.3	0
```

Before
```
Type	Name	# Requests	# Fails	Median (ms)	95%ile (ms)	99%ile (ms)	Average (ms)	Min (ms)	Max (ms)	Average size (bytes)	Current RPS	Current Failures/s
GET	/echo?message=hello	27427	0	210	410	470	232.12	76	604	26	429.7	0
Aggregated	27427	0	210	410	470	232.12	76	604	26	429.7	0
```

Additionally, the old implementation wrongly assumed that only one
method (e.g., GET or PUT) corresponds to a route. This PR fixes that
assumption and adds tests for it.

---------

Signed-off-by: abrar <abrar@anyscale.com>
ykdojo pushed a commit to ykdojo/ray that referenced this pull request Nov 27, 2025
…te matching (ray-project#58927)

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: YK <1811651+ykdojo@users.noreply.github.com>
SheldonTsen pushed a commit to SheldonTsen/ray that referenced this pull request Dec 1, 2025
…te matching (ray-project#58927)

Signed-off-by: abrar <abrar@anyscale.com>
Labels

go add ONLY when ready to merge, run all tests observability Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling serve Ray Serve Related Issue

Development

Successfully merging this pull request may close these issues.

[Serve] change the metric tag for the proxy metrics to route_prefix for clarity

4 participants