@axreldable axreldable commented Oct 10, 2025

Why are these changes needed?

  1. Fix bug with 'proxy_location' set for 'serve run' CLI command

The serve run CLI command ignores proxy_location from the config and always uses the default value EveryNode.

Steps to reproduce:

  • have a script:
# hello_world.py
from ray.serve import deployment

@deployment
async def hello_world():
    return "Hello, world!"

hello_world_app = hello_world.bind()

Execute:

ray stop
ray start --head
serve build -o config.yaml hello_world:hello_world_app
  • change proxy_location in the config.yaml: EveryNode -> Disabled
serve run config.yaml
curl -s -X GET "http://localhost:8265/api/serve/applications/" | jq -r '.proxy_location'

Output:

Before change:
EveryNode - but Disabled expected
After change:
Disabled
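
The curl and jq check above can also be done from Python. This is a minimal sketch using only the standard library; the function names are hypothetical, and the only details taken from the steps above are the endpoint URL and the top-level proxy_location field of the response.

```python
import json
from urllib.request import urlopen

# Endpoint taken from the curl command in the reproduction steps.
SERVE_APPLICATIONS_URL = "http://localhost:8265/api/serve/applications/"


def parse_proxy_location(body: str) -> str:
    """Return the top-level proxy_location field, like jq -r '.proxy_location'."""
    return json.loads(body)["proxy_location"]


def fetch_proxy_location(url: str = SERVE_APPLICATIONS_URL) -> str:
    """GET the Serve applications endpoint and extract proxy_location."""
    with urlopen(url, timeout=10) as response:
        return parse_proxy_location(response.read().decode("utf-8"))
```

With a cluster running, print(fetch_proxy_location()) should report the same value as the curl command.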
  2. Fix discrepancy for 'proxy_location' in the Python API 'start' method

The serve.start function in the Python API sets a different http_options.location depending on whether http_options is provided.

Steps to reproduce:

  • have a script:
# discrepancy.py
import time

from ray import serve
from ray.serve.context import _get_global_client

if __name__ == '__main__':
    serve.start()
    client = _get_global_client()
    print(f"Empty http_options: `{client.http_config.location}`")

    serve.shutdown()
    time.sleep(5)

    serve.start(http_options={"host": "0.0.0.0"})
    client = _get_global_client()
    print(f"Non empty http_options: `{client.http_config.location}`")

Execute:

ray stop
ray start --head
python -m discrepancy

Output:

Before change:
Empty http_options: `EveryNode`
Non empty http_options: `HeadOnly`
After change:
Empty http_options: `EveryNode`
Non empty http_options: `EveryNode`

This PR changes the current behavior in the following ways:

  1. The serve run CLI command respects the proxy_location parameter from the config instead of using the hardcoded EveryNode.
  2. The serve.start function in the Python API no longer defaults to HeadOnly when proxy_location is empty and the provided http_options dictionary does not specify location.
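
The change in defaults can be modeled with a small sketch that does not need Ray. The function names resolve_location_before and resolve_location_after are hypothetical, and the precedence between an explicit proxy_location and http_options["location"] is an assumption. Only the two default cases come from the description above.

```python
from typing import Optional


def resolve_location_before(
    proxy_location: Optional[str], http_options: Optional[dict]
) -> str:
    """Model of the old behavior: the default depended on http_options."""
    if proxy_location is not None:
        return proxy_location
    if not http_options:
        return "EveryNode"
    # A non-empty http_options without "location" fell back to HeadOnly.
    return http_options.get("location", "HeadOnly")


def resolve_location_after(
    proxy_location: Optional[str], http_options: Optional[dict]
) -> str:
    """Model of the new behavior: one default regardless of http_options."""
    if proxy_location is not None:
        return proxy_location
    return (http_options or {}).get("location", "EveryNode")


if __name__ == "__main__":
    for opts in (None, {"host": "0.0.0.0"}):
        print(
            opts,
            resolve_location_before(None, opts),
            resolve_location_after(None, opts),
        )
```

Callers that relied on the old HeadOnly fallback have to request it explicitly after the change.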

Related issue number

Aims to simplify changes in the PR: #56507

Checks

  • [x] I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • [x] I've run pre-commit jobs to lint the changes in this PR. (pre-commit setup)
  • [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
    • [ ] I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [x] Unit tests
    • [ ] Release tests
    • [ ] This PR is not tested :(

Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
@axreldable axreldable force-pushed the 56163_http_options_5 branch 2 times, most recently from 3de646d to 3cdf062 Compare October 10, 2025 13:13
Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
@axreldable axreldable force-pushed the 56163_http_options_5 branch from 3cdf062 to b4c6107 Compare October 10, 2025 16:02
    _system_config={"metrics_report_interval_ms": 1000, "task_retry_delay_ms": 50},
)
serve.start(
    proxy_location=ProxyLocation.HeadOnly,
Contributor Author

This is the current value which is used implicitly.

Contributor

just confirming, even if we remove this, serve.start will still use HeadOnly?

Contributor Author

@axreldable axreldable Oct 26, 2025

That was true before the change. On current master, when proxy_location is not provided explicitly, the resulting value depends on whether http_options is passed to serve.start: empty http_options gives EveryNode, but non-empty http_options gives HeadOnly.

This PR removes that discrepancy. After the change, serve.start starts the cluster with the default EveryNode proxy_location whenever the proxy_location parameter is empty.

See the example in the description, under "Fix discrepancy for 'proxy_location' in the Python API 'start' method".

So I now pass it explicitly:

serve.start(
        proxy_location=ProxyLocation.HeadOnly,

This keeps the behavior in the tests the same as before the change.

Contributor

makes sense :)

@axreldable axreldable changed the title from "56163 http options 5" to "[serve] Fix bug with 'proxy_location' set for 'serve run' CLI command + discrepancy fix in Python API 'serve.start' function" Oct 10, 2025
@axreldable axreldable marked this pull request as ready for review October 10, 2025 19:28
@axreldable axreldable requested a review from a team as a code owner October 10, 2025 19:28
@axreldable axreldable force-pushed the 56163_http_options_5 branch from b63089c to 3d44bf7 Compare October 11, 2025 15:10
@harshit-anyscale harshit-anyscale added the go label (add ONLY when ready to merge, run all tests) Oct 13, 2025
Contributor

@harshit-anyscale harshit-anyscale left a comment

lgtm

@axreldable
Contributor Author

Thank you for the approval, @harshit-anyscale!
@abrarsheikh, could you please review this as well?

Contributor

@abrarsheikh abrarsheikh left a comment

Does serve run restart the cluster? I am trying to understand what happens if we call serve run the very first time with location = HeadOnly, then call serve run again with location = EveryNode. How does this change percolate through the cluster?

Is the entire cluster restarted, including the controller and proxies? If not, should we disallow changing location during the second serve run command?

Comment on lines 811 to 814
def prepare_http_options(
    proxy_location: Union[None, str, ProxyLocation],
    http_options: Union[None, dict, HTTPOptions],
) -> HTTPOptions:
Contributor

We are using this function to consolidate http_options from the http_options and proxy_location passed by the user. That is well intended, but the issue is that:

  1. In imperative mode, the user passes HTTPOptions.
  2. In declarative mode, the user passes HTTPOptionsSchema.

I think we should keep this function specific to imperative mode.

Contributor Author

OK, I can remove the use of the prepare_http_options function from scripts.py and use it only in the serve.start API. Would that work?

Also, could you please clarify what you mean by imperative and declarative modes?

Contributor

imperative -> serve.run(app)
declarative -> serve deploy config.yaml

Contributor Author

Thank you for the clarification! I renamed prepare_http_options to prepare_imperative_http_options and left a note about the distinction in the docstring.

Comment on lines 537 to 543
proxy_location = None
http_options = None
grpc_options = gRPCOptions()
# Merge http_options and grpc_options with the ones on ServeDeploySchema.
if is_config and isinstance(config, ServeDeploySchema):
    config_http_options = config.http_options.dict()
    http_options = {**config_http_options, **http_options}
    proxy_location = config.proxy_location
    http_options = config.http_options.dict()
Contributor

If the core issue we are trying to solve is that serve run does not pick up proxy_location from the config, is the following sufficient to fix it?

    http_options = {"location": "EveryNode"}
    grpc_options = gRPCOptions()
    # Merge http_options and grpc_options with the ones on ServeDeploySchema.
    if is_config and isinstance(config, ServeDeploySchema):
        http_options["location"] = config.proxy_location
        config_http_options = config.http_options.dict()
        http_options = {**config_http_options, **http_options}
        grpc_options = gRPCOptions(**config.grpc_options.dict())

    client = _private_api.serve_start(
        http_options=http_options,
        .....

Contributor Author

Almost, this will be sufficient:

    http_options = {"location": "EveryNode"}
    grpc_options = gRPCOptions()
    # Merge http_options and grpc_options with the ones on ServeDeploySchema.
    if is_config and isinstance(config, ServeDeploySchema):
        http_options["location"] = ProxyLocation._to_deployment_mode(config.proxy_location).value
        config_http_options = config.http_options.dict()
        http_options = {**config_http_options, **http_options}
        grpc_options = gRPCOptions(**config.grpc_options.dict())

The only difference from your snippet is the location line:

    http_options["location"] = config.proxy_location

becomes

    http_options["location"] = ProxyLocation._to_deployment_mode(config.proxy_location).value

Contributor

makes sense, then let's make this change

Contributor Author

Applied the change.
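
The agreed snippet relies on how Python merges dictionaries: in {**config_http_options, **http_options}, keys from the right-hand dictionary win. The sketch below shows that with plain dictionaries. The host and port values are illustrative stand-ins for config.http_options.dict(), not output from Ray.

```python
# Illustrative stand-in for config.http_options.dict().
config_http_options = {"host": "0.0.0.0", "port": 8000}

# Location derived from the config's proxy_location, as in the snippet above.
http_options = {"location": "HeadOnly"}

# Keys unpacked later override keys unpacked earlier.
merged = {**config_http_options, **http_options}
print(merged)

# The same rule applies when both sides define a key: the right side wins.
overridden = {**{"port": 8000}, **{"port": 9000}}
print(overridden["port"])
```

Because http_options is unpacked last, the location taken from proxy_location survives the merge even if the config's http_options were to carry a conflicting key.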

@axreldable
Contributor Author

axreldable commented Oct 26, 2025

> Does serve run restart the cluster?

No. The run command calls serve._private.serve_start, which either starts the cluster or returns the existing ServeControllerClient.

> I am trying to understand what happens if we call serve run the very first time with location = HeadOnly, then call serve run again with location = EveryNode. How does this change percolate through the cluster?

The cluster is not restarted and continues running with location = HeadOnly. Info and warning messages are logged about the attempt to change the HTTP options of the cluster.

However, right now you cannot start a cluster with the serve run CLI command using location = HeadOnly from the config file: it is overridden with EveryNode. This PR fixes that.

> Is the entire cluster restarted, including the controller and proxies? If not, should we disallow changing location during the second serve run command?

Correct, I'm working on exactly that in this PR: "Fail on the change of 'proxy_location' or 'http_options' parameters for the 'serve' API". To simplify that change, I opened this PR to resolve the bug first, because the logic there became too complicated with this bug and the discrepancy.
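
The start-or-reuse behavior described in this reply can be sketched without Ray. FakeClient and start_or_get_client are hypothetical names for illustration only. This is a model of the described behavior under those assumptions, not the code in serve_start.

```python
import logging
from typing import Optional

logger = logging.getLogger("serve_start_sketch")


class FakeClient:
    """Hypothetical stand-in for a controller client; not a Ray class."""

    def __init__(self, location: str) -> None:
        self.location = location


_client: Optional[FakeClient] = None


def start_or_get_client(location: str) -> FakeClient:
    """Start once; later calls return the running client unchanged."""
    global _client
    if _client is None:
        _client = FakeClient(location)
    elif _client.location != location:
        # The running instance keeps its original location.
        logger.warning(
            "Already running with location=%s; ignoring requested %s",
            _client.location,
            location,
        )
    return _client


if __name__ == "__main__":
    first = start_or_get_client("HeadOnly")
    second = start_or_get_client("EveryNode")
    print(first is second, second.location)
```

In this model the second call only logs a warning. Turning that warning into an error is the follow-up behavior mentioned in the reply.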

@axreldable
Contributor Author

Hi @abrarsheikh! I addressed the comments. Could you please take another look?

@abrarsheikh abrarsheikh merged commit 92d8471 into ray-project:master Oct 29, 2025
6 checks passed
@axreldable axreldable deleted the 56163_http_options_5 branch October 29, 2025 18:38
YoussefEssDS pushed a commit to YoussefEssDS/ray that referenced this pull request Nov 8, 2025
… + discrepancy fix in Python API 'serve.start' function (ray-project#57622)

landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
… + discrepancy fix in Python API 'serve.start' function (ray-project#57622)

Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
… + discrepancy fix in Python API 'serve.start' function (ray-project#57622)

Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Labels

community-contribution (Contributed by the community), go (add ONLY when ready to merge, run all tests), serve (Ray Serve Related Issue)
