[doc][serve][llm] Model loading Docs #57922
Conversation
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
kouroshHakha left a comment:
Cool, need to wait until this lands: https://github.com/anyscale/ray-serve-llm-perf-examples/tree/master/replica_initialization. In the meantime, please address the comments below.
In this example the cache folder is located at `/home/ray/.cache/vllm/torch_compile_cache/131ee5c6d9`.
Upload this folder to your S3 bucket so the cache can be retrieved at startup. We provide a custom utility to download the compile cache from cloud storage: specify the `CloudDownloader` callback in `LLMConfig` and supply the relevant arguments. Make sure the `cache_dir` in `compilation_config` points to the downloaded cache.
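A rough sketch of how this could be wired together, based on the doc text above. The callback-registration field (`callback_config`), the `CloudDownloader` import path, and its `source`/`destination` argument names are assumptions, not confirmed API; the bucket path and model id are hypothetical. Only `LLMConfig`, `build_openai_app`, and vLLM's `compilation_config.cache_dir` are known pieces.

```python
# Hedged sketch: the callback_config field, CloudDownloader path, and
# its kwarg names are assumptions based on the doc text, not confirmed
# API. Bucket path and model id are hypothetical placeholders.
from ray import serve
from ray.serve.llm import LLMConfig, build_openai_app

CACHE_DIR = "/home/ray/.cache/vllm/torch_compile_cache/131ee5c6d9"

llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="qwen-0.5b",                       # hypothetical id
        model_source="Qwen/Qwen2.5-0.5B-Instruct",  # hypothetical model
    ),
    engine_kwargs=dict(
        # Point vLLM at the directory the downloader populates so the
        # engine reuses the precompiled artifacts instead of recompiling.
        compilation_config=dict(cache_dir=CACHE_DIR),
    ),
    # Assumed hook: fetch the compile cache from cloud storage into
    # CACHE_DIR before the engine starts (field and kwarg names assumed).
    callback_config=dict(
        callback_class="ray.serve.llm.callbacks.CloudDownloader",
        callback_kwargs=dict(
            source="s3://my-bucket/torch_compile_cache/131ee5c6d9",
            destination=CACHE_DIR,
        ),
    ),
)

app = build_openai_app({"llm_configs": [llm_config]})
serve.run(app)
```

The one invariant the sketch tries to make obvious: the downloader's destination and the engine's `cache_dir` must be the same path (hence the shared `CACHE_DIR` constant), otherwise vLLM falls back to recompiling at startup and the download buys nothing.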
bump
Description
Documents best practices and strategies for reducing autoscaling time for Ray Serve LLM replicas.