-
Notifications
You must be signed in to change notification settings - Fork 7k
[serve] Fix bug with 'proxy_location' set for 'serve run' CLI command + discrepancy fix in Python API 'serve.start' function #57622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[serve] Fix bug with 'proxy_location' set for 'serve run' CLI command + discrepancy fix in Python API 'serve.start' function #57622
Conversation
Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
…' method Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
3de646d to
3cdf062
Compare
Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
3cdf062 to
b4c6107
Compare
| _system_config={"metrics_report_interval_ms": 1000, "task_retry_delay_ms": 50}, | ||
| ) | ||
| serve.start( | ||
| proxy_location=ProxyLocation.HeadOnly, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the current value which is used implicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just confirming, even if we remove this, serve.start will still use HeadOnly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was true before the change. In the current master, the result proxy_location if not provided explicitly depends on the presence of http_options in serve.start parameters. Empty http_options gives EveryNode, but non-empty http_options gives HeadOnly.
This PR changes this discrepancy. After the change serve.start runs cluster with default EveryNode proxy_location in case of empty proxy_location parameter.
See the example in the description:
- Fix discrepancy for 'proxy_location' in the Python API 'start' method
serve.startfunction in Python API sets differenthttp_options.locationdepending on ifhttp_optionsis provided.Steps to reproduce:
- have a script:
# discrepancy.py import time from ray import serve from ray.serve.context import _get_global_client if __name__ == '__main__': serve.start() client = _get_global_client() print(f"Empty http_options: `{client.http_config.location}`") serve.shutdown() time.sleep(5) serve.start(http_options={"host": "0.0.0.0"}) client = _get_global_client() print(f"Non empty http_options: `{client.http_config.location}`")Execute:
ray stop ray start --head python -m discrepancyOutput:
Before change: Empty http_options: `EveryNode` Non empty http_options: `HeadOnly` After change: Empty http_options: `EveryNode` Non empty http_options: `EveryNode`
So, I use now
serve.start(
proxy_location=ProxyLocation.HeadOnly,
to have the same behavior in tests as before the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense :)
Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
b63089c to
3d44bf7
Compare
Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
harshit-anyscale
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
|
Thank you for the approval, @harshit-anyscale ! |
abrarsheikh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does serve run restarts the cluster? I am trying to understand what happens if we call serve run the very first time with location = HeadOnly, then call serve run again with location = EveryNode. How does this change percolate through the cluster.
Is the entire cluster restarted include controller, proxies? If not should be disallow changing location during the second serve run command?
python/ray/serve/_private/config.py
Outdated
| def prepare_http_options( | ||
| proxy_location: Union[None, str, ProxyLocation], | ||
| http_options: Union[None, dict, HTTPOptions], | ||
| ) -> HTTPOptions: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using this function to consolidate http_options, using http_options and proxy_location passed by user. Which is well intended, but the issue is that
- In imperative mode, user passes HTTPOptions
- In declarative mode user passes HTTPOptionsSchema
I think we should keep this function specific to imperative mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I can remove the usage of the prepare_http_options func from scripts.py and use it in serve.start api only. Will it work?
And could you please clarify what do you mean by imperative and declarative modes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imperative ->serve.run(app)
declarative -> serve deploy config.yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the clarification! I renamed prepare_http_options -> prepare_imperative_http_options and left a note about the distinction in the doc string.
python/ray/serve/scripts.py
Outdated
| proxy_location = None | ||
| http_options = None | ||
| grpc_options = gRPCOptions() | ||
| # Merge http_options and grpc_options with the ones on ServeDeploySchema. | ||
| if is_config and isinstance(config, ServeDeploySchema): | ||
| config_http_options = config.http_options.dict() | ||
| http_options = {**config_http_options, **http_options} | ||
| proxy_location = config.proxy_location | ||
| http_options = config.http_options.dict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the core issue we are trying to solve is that serve run does not pick proxy_location from config, then is the following sufficient to fix that?
http_options = {"location": "EveryNode"}
grpc_options = gRPCOptions()
# Merge http_options and grpc_options with the ones on ServeDeploySchema.
if is_config and isinstance(config, ServeDeploySchema):
http_options["location"] = config.proxy_location
config_http_options = config.http_options.dict()
http_options = {**config_http_options, **http_options}
grpc_options = gRPCOptions(**config.grpc_options.dict())
client = _private_api.serve_start(
http_options=http_options,
.....There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost, this will be sufficient:
http_options = {"location": "EveryNode"}
grpc_options = gRPCOptions()
# Merge http_options and grpc_options with the ones on ServeDeploySchema.
if is_config and isinstance(config, ServeDeploySchema):
http_options["location"] = ProxyLocation._to_deployment_mode(config.proxy_location).value
config_http_options = config.http_options.dict()
http_options = {**config_http_options, **http_options}
grpc_options = gRPCOptions(**config.grpc_options.dict())http_options["location"] = config.proxy_location
--->
http_options["location"] = ProxyLocation._to_deployment_mode(config.proxy_location).valueThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, then let's make this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied the change.
| _system_config={"metrics_report_interval_ms": 1000, "task_retry_delay_ms": 50}, | ||
| ) | ||
| serve.start( | ||
| proxy_location=ProxyLocation.HeadOnly, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just confirming, even if we remove this, serve.start will still use HeadOnly?
No, the
Cluster will not be restarted and continue running with However, you can't right now start cluster with the cli
Correct, I'm working on exactly that in this PR - |
…name to prepare_imperative_http_options Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
|
Hi @abrarsheikh ! I addressed comments. Could you please check? |
| _system_config={"metrics_report_interval_ms": 1000, "task_retry_delay_ms": 50}, | ||
| ) | ||
| serve.start( | ||
| proxy_location=ProxyLocation.HeadOnly, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense :)
… + discrepancy fix in Python API 'serve.start' function (ray-project#57622) <!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? 1. Fix bug with 'proxy_location' set for 'serve run' CLI command `serve run` CLI command ignores `proxy_location` from config and uses default value `EveryNode`. Steps to reproduce: - have a script: ```python # hello_world.py from ray.serve import deployment @deployment async def hello_world(): return "Hello, world!" hello_world_app = hello_world.bind() ``` Execute: ``` ray stop ray start --head serve build -o config.yaml hello_world:hello_world_app ``` - change `proxy_location` in the `config.yaml`: EveryNode -> Disabled ``` serve run config.yaml curl -s -X GET "http://localhost:8265/api/serve/applications/" | jq -r '.proxy_location' ``` Output: ``` Before change: EveryNode - but Disabled expected After change: Disabled ``` 2. Fix discrepancy for 'proxy_location' in the Python API 'start' method `serve.start` function in Python API sets different `http_options.location` depending on if `http_options` is provided. Steps to reproduce: - have a script: ```python # discrepancy.py import time from ray import serve from ray.serve.context import _get_global_client if __name__ == '__main__': serve.start() client = _get_global_client() print(f"Empty http_options: `{client.http_config.location}`") serve.shutdown() time.sleep(5) serve.start(http_options={"host": "0.0.0.0"}) client = _get_global_client() print(f"Non empty http_options: `{client.http_config.location}`") ``` Execute: ``` ray stop ray start --head python -m discrepancy ``` Output: ``` Before change: Empty http_options: `EveryNode` Non empty http_options: `HeadOnly` After change: Empty http_options: `EveryNode` Non empty http_options: `EveryNode` ``` ------------------------------------------------------------- It changes current behavior in the following ways: 1. `serve run` CLI command respects `proxy_location` parameter from config instead of using the hardcoded `EveryNode`. 2. `serve.start` function in Python API stops using the default `HeadOnly` in case of empty `proxy_location` and provided `http_options` dictionary without `location` specified. <!-- Please give a short summary of the change and the problem this solves. --> ## Related issue number <!-- For example: "Closes ray-project#1234" --> Aims to simplify changes in the PR: ray-project#56507 ## Checks - [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [x] I've run pre-commit jobs to lint the changes in this PR. ([pre-commit setup](https://docs.ray.io/en/latest/ray-contribute/getting-involved.html#lint-and-formatting)) - [x] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [x] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
… + discrepancy fix in Python API 'serve.start' function (ray-project#57622) <!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? 1. Fix bug with 'proxy_location' set for 'serve run' CLI command `serve run` CLI command ignores `proxy_location` from config and uses default value `EveryNode`. Steps to reproduce: - have a script: ```python # hello_world.py from ray.serve import deployment @deployment async def hello_world(): return "Hello, world!" hello_world_app = hello_world.bind() ``` Execute: ``` ray stop ray start --head serve build -o config.yaml hello_world:hello_world_app ``` - change `proxy_location` in the `config.yaml`: EveryNode -> Disabled ``` serve run config.yaml curl -s -X GET "http://localhost:8265/api/serve/applications/" | jq -r '.proxy_location' ``` Output: ``` Before change: EveryNode - but Disabled expected After change: Disabled ``` 2. Fix discrepancy for 'proxy_location' in the Python API 'start' method `serve.start` function in Python API sets different `http_options.location` depending on if `http_options` is provided. Steps to reproduce: - have a script: ```python # discrepancy.py import time from ray import serve from ray.serve.context import _get_global_client if __name__ == '__main__': serve.start() client = _get_global_client() print(f"Empty http_options: `{client.http_config.location}`") serve.shutdown() time.sleep(5) serve.start(http_options={"host": "0.0.0.0"}) client = _get_global_client() print(f"Non empty http_options: `{client.http_config.location}`") ``` Execute: ``` ray stop ray start --head python -m discrepancy ``` Output: ``` Before change: Empty http_options: `EveryNode` Non empty http_options: `HeadOnly` After change: Empty http_options: `EveryNode` Non empty http_options: `EveryNode` ``` ------------------------------------------------------------- It changes current behavior in the following ways: 1. `serve run` CLI command respects `proxy_location` parameter from config instead of using the hardcoded `EveryNode`. 2. `serve.start` function in Python API stops using the default `HeadOnly` in case of empty `proxy_location` and provided `http_options` dictionary without `location` specified. <!-- Please give a short summary of the change and the problem this solves. --> ## Related issue number <!-- For example: "Closes ray-project#1234" --> Aims to simplify changes in the PR: ray-project#56507 ## Checks - [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [x] I've run pre-commit jobs to lint the changes in this PR. ([pre-commit setup](https://docs.ray.io/en/latest/ray-contribute/getting-involved.html#lint-and-formatting)) - [x] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [x] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com>
… + discrepancy fix in Python API 'serve.start' function (ray-project#57622) <!-- Thank you for your contribution! Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request. --> <!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. --> ## Why are these changes needed? 1. Fix bug with 'proxy_location' set for 'serve run' CLI command `serve run` CLI command ignores `proxy_location` from config and uses default value `EveryNode`. Steps to reproduce: - have a script: ```python # hello_world.py from ray.serve import deployment @deployment async def hello_world(): return "Hello, world!" hello_world_app = hello_world.bind() ``` Execute: ``` ray stop ray start --head serve build -o config.yaml hello_world:hello_world_app ``` - change `proxy_location` in the `config.yaml`: EveryNode -> Disabled ``` serve run config.yaml curl -s -X GET "http://localhost:8265/api/serve/applications/" | jq -r '.proxy_location' ``` Output: ``` Before change: EveryNode - but Disabled expected After change: Disabled ``` 2. Fix discrepancy for 'proxy_location' in the Python API 'start' method `serve.start` function in Python API sets different `http_options.location` depending on if `http_options` is provided. Steps to reproduce: - have a script: ```python # discrepancy.py import time from ray import serve from ray.serve.context import _get_global_client if __name__ == '__main__': serve.start() client = _get_global_client() print(f"Empty http_options: `{client.http_config.location}`") serve.shutdown() time.sleep(5) serve.start(http_options={"host": "0.0.0.0"}) client = _get_global_client() print(f"Non empty http_options: `{client.http_config.location}`") ``` Execute: ``` ray stop ray start --head python -m discrepancy ``` Output: ``` Before change: Empty http_options: `EveryNode` Non empty http_options: `HeadOnly` After change: Empty http_options: `EveryNode` Non empty http_options: `EveryNode` ``` ------------------------------------------------------------- It changes current behavior in the following ways: 1. `serve run` CLI command respects `proxy_location` parameter from config instead of using the hardcoded `EveryNode`. 2. `serve.start` function in Python API stops using the default `HeadOnly` in case of empty `proxy_location` and provided `http_options` dictionary without `location` specified. <!-- Please give a short summary of the change and the problem this solves. --> ## Related issue number <!-- For example: "Closes ray-project#1234" --> Aims to simplify changes in the PR: ray-project#56507 ## Checks - [x] I've signed off every commit(by using the -s flag, i.e., `git commit -s`) in this PR. - [x] I've run pre-commit jobs to lint the changes in this PR. ([pre-commit setup](https://docs.ray.io/en/latest/ray-contribute/getting-involved.html#lint-and-formatting)) - [x] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file. - [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [x] Unit tests - [ ] Release tests - [ ] This PR is not tested :( --------- Signed-off-by: axreldable <aleksei.starikov.ax@gmail.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Why are these changes needed?
serve runCLI command ignoresproxy_locationfrom config and uses default valueEveryNode.Steps to reproduce:
Execute:
proxy_locationin theconfig.yaml: EveryNode -> DisabledOutput:
serve.startfunction in Python API sets differenthttp_options.locationdepending on ifhttp_optionsis provided.Steps to reproduce:
Execute:
Output:
It changes current behavior in the following ways:
serve runCLI command respectsproxy_locationparameter from config instead of using the hardcodedEveryNode.serve.startfunction in Python API stops using the defaultHeadOnlyin case of emptyproxy_locationand providedhttp_optionsdictionary withoutlocationspecified.Related issue number
Aims to simplify changes in the PR: #56507
Checks
git commit -s) in this PR.method in Tune, I've added it in
doc/source/tune/api/under thecorresponding
.rstfile.