@abrarsheikh
Contributor

part 1 of #56149

  1. Move _serialized_policy_def from AutoscalingConfig into AutoscalingPolicy. This is needed so that AutoscalingPolicy can be reused for application-level autoscaling.
  2. Make autoscaling_policy a top-level config in ServeApplicationSchema.

Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh requested a review from a team as a code owner October 8, 2025 02:53
cursor[bot]

This comment was marked as outdated.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the autoscaling policy configuration by moving the serialization logic from AutoscalingConfig into AutoscalingPolicy. It also introduces an application-level autoscaling_policy in ServeApplicationSchema. The changes are logical and improve encapsulation.

I've identified a critical issue in the protobuf schema update that breaks wire compatibility, and a medium-severity issue regarding schema definition in ServeApplicationSchema. Please see the detailed comments.

Comment on lines 72 to +87
  // The autoscaling policy definition.
- AutoscalingPolicy policy = 12;
+ AutoscalingPolicy policy = 11;

  // Target number of in flight requests per replica. This is the primary configuration
  // knob for replica autoscaler. Lower the number, the more rapidly the replicas
  // scales up. Must be a non-negative integer.
- double target_ongoing_requests = 13;
+ double target_ongoing_requests = 12;

  // The multiplicative "gain" factor to limit upscale.
- optional double upscaling_factor = 14;
+ optional double upscaling_factor = 13;

  // The multiplicative "gain" factor to limit downscale.
- optional double downscaling_factor = 15;
+ optional double downscaling_factor = 14;

  // How long to wait before scaling down replicas from 1 to 0
- optional double downscale_to_zero_delay_s = 16;
+ optional double downscale_to_zero_delay_s = 15;

critical

Renumbering protobuf fields is a wire-incompatible breaking change. An older component serializing an AutoscalingConfig message will use field numbers 12, 13, etc., for policy, target_ongoing_requests, etc. A newer component will interpret these as target_ongoing_requests, upscaling_factor, etc., leading to silent data corruption or deserialization errors.

To maintain compatibility, you should not renumber the existing fields. Please revert the field numbers to their original values. It's also a good practice to mark the removed field number as reserved to prevent accidental reuse, like this:

...
optional double downscale_smoothing_factor = 10;

reserved 11;

// The autoscaling policy definition.
AutoscalingPolicy policy = 12;
...
Suggested change
// The autoscaling policy definition.
AutoscalingPolicy policy = 12;
// Target number of in flight requests per replica. This is the primary configuration
// knob for replica autoscaler. Lower the number, the more rapidly the replicas
// scales up. Must be a non-negative integer.
double target_ongoing_requests = 13;
// The multiplicative "gain" factor to limit upscale.
optional double upscaling_factor = 14;
// The multiplicative "gain" factor to limit downscale.
optional double downscaling_factor = 15;
// How long to wait before scaling down replicas from 1 to 0
optional double downscale_to_zero_delay_s = 16;
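
The wire-compatibility concern raised in this review comment can be illustrated with a small standalone sketch. It does not use the protobuf library or Ray's generated classes; it hand-encodes only the `double` fields using the standard protobuf tag layout (`field_number << 3 | wire_type`), and the helper names (`encode_double_field`, `decode_double_fields`, `OLD_SCHEMA`, `NEW_SCHEMA`) are made up for illustration. The `policy` message field is left out to keep the sketch short. The problem only appears when a writer and a reader built from different schema versions exchange messages.

```python
import struct
from typing import Dict, Tuple

WIRE_TYPE_FIXED64 = 1  # protobuf wire type used for `double`


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint starting at `pos`; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_double_field(field_number: int, value: float) -> bytes:
    """Encode one `double` field: varint tag, then 8 little-endian bytes."""
    tag = (field_number << 3) | WIRE_TYPE_FIXED64
    return encode_varint(tag) + struct.pack("<d", value)


def decode_double_fields(data: bytes, schema: Dict[int, str]) -> Dict[str, float]:
    """Decode a payload of `double` fields, naming them by field number."""
    decoded: Dict[str, float] = {}
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if wire_type != WIRE_TYPE_FIXED64:
            raise ValueError(f"this sketch only handles doubles, got wire type {wire_type}")
        (value,) = struct.unpack_from("<d", data, pos)
        pos += 8
        name = schema.get(field_number)
        if name is not None:  # unknown field numbers are skipped
            decoded[name] = value
    return decoded


# Field numbers before and after the renumbering shown in the diff.
OLD_SCHEMA = {13: "target_ongoing_requests", 14: "upscaling_factor",
              15: "downscaling_factor", 16: "downscale_to_zero_delay_s"}
NEW_SCHEMA = {12: "target_ongoing_requests", 13: "upscaling_factor",
              14: "downscaling_factor", 15: "downscale_to_zero_delay_s"}

# A writer built against the old schema sets two fields.
payload = encode_double_field(13, 2.0) + encode_double_field(14, 1.5)

read_by_old = decode_double_fields(payload, OLD_SCHEMA)
read_by_new = decode_double_fields(payload, NEW_SCHEMA)
print(read_by_old)
print(read_by_new)
```

Under the old numbering the payload decodes to the values that were written. Under the new numbering the same bytes parse without error, but each value is attributed to the next field in the list, which is the silent misreading the comment describes.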

Contributor

+1, we shouldn't change the numbers assigned to existing fields; doing so can break backward compatibility.

Contributor Author

This is not a problem because we don't store these proto records in a DB and read them back in a different session.

Comment on lines 563 to 570
autoscaling_policy: dict = Field(
default={},
description=(
"Autoscaling policy for the application. "
"If null, serve fallbacks to autoscaling policy in each deployment. "
"This option is under development and not yet supported."
),
)

medium

The type hint and default value for autoscaling_policy are inconsistent with its description. The description mentions "If null...", which implies the field should be optional and can be None.

  1. The type hint should be Optional[dict] to allow None as a value.
  2. The default value should be None to match the "if null" condition described.
  3. Using default={} creates a mutable default value, which can lead to unexpected behavior. It's better to use default=None or default_factory=dict. In this case, default=None is the most clear and correct choice based on the description.
Suggested change
autoscaling_policy: Optional[dict] = Field(
default=None,
description=(
"Autoscaling policy for the application. "
"If null, serve fallbacks to autoscaling policy in each deployment. "
"This option is under development and not yet supported."
),
)
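
The distinction between a null default and an empty dict can be shown with a small sketch in plain Python. The helper `resolve_policy` and the example policy dicts are hypothetical and are not Ray Serve's implementation; they only illustrate the "if null, fall back to the deployment policy" behavior named in the field description.

```python
from typing import Optional


def resolve_policy(app_policy: Optional[dict], deployment_policy: dict) -> dict:
    """Hypothetical helper: use the application-level policy only if one was set.

    None means "not configured", so the deployment's own policy applies.
    Any dict, including an empty one, counts as an explicit setting.
    """
    if app_policy is None:
        return deployment_policy
    return app_policy


# Placeholder policy contents, chosen only for this example.
deployment_policy = {"name": "deployment-default"}

unset_result = resolve_policy(None, deployment_policy)
empty_result = resolve_policy({}, deployment_policy)
explicit_result = resolve_policy({"name": "app-level"}, deployment_policy)

print(unset_result)
print(empty_result)
print(explicit_result)
```

With `None` as the default, "no application-level policy" has an unambiguous representation. With `{}` as the default, an unset field and an explicitly empty policy look the same, so the fallback described in the field's help text would need an extra convention.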

Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh added the "go" label (add ONLY when ready to merge, run all tests) Oct 8, 2025
@ray-gardener ray-gardener bot added the "serve" label (Ray Serve Related Issue) Oct 8, 2025
@abrarsheikh abrarsheikh requested review from akyang-anyscale, harshit-anyscale and zcin and removed request for akyang-anyscale October 8, 2025 17:56
@zcin
Contributor

zcin commented Oct 9, 2025

@abrarsheikh merge conflicts

@zcin zcin enabled auto-merge (squash) October 10, 2025 00:26
Signed-off-by: abrar <abrar@anyscale.com>
@github-actions github-actions bot disabled auto-merge October 10, 2025 01:56
cursor[bot]

This comment was marked as outdated.

Signed-off-by: abrar <abrar@anyscale.com>
@Kishanthan
Contributor

LGTM

@zcin zcin merged commit 023e470 into master Oct 10, 2025
6 checks passed
@zcin zcin deleted the SERVE-1215-abrar-schema branch October 10, 2025 17:15