Migrate docs from Sphinx to MkDocs #18145
Merged
Changes from all commits (115 commits)
All 115 commits are authored by hmellor:

- da2d299 temporary gitignore
- 6c861ca first commit
- af3b60d Add missing requirement
- 12ec865 Handle list tables
- 58ee036 Show all examples
- 3f3572d Change structure to look more like final structure
- c9ee202 Update gitignore
- 4de42f1 Don't blindly copy all
- 40fdcec Remove index pages which add nothing
- 8047a0a Handle image blocks
- 4b4e177 Only transpile files with extensions
- f85f172 Remove unneeded TOCs from index
- 19047ba Remove another unneeded index
- 2509639 We don't do toctrees anymore
- 0a7c43d Generate examples using mkdocs format
- 53f055f Handle TOC
- 812bd8c More nav improvements
- 32fab12 Mark image as handled
- 1df088a Handle GitHub URL schemes
- 2188496 Fix RunLLM widget
- 42af380 Update `.readthedocs.yaml`
- 5fd0c33 Adjust `.nav.yml`
- a4a125a Enable emoji for GitHub links
- 65a3cc9 Tweak
- 1ded939 Use transpile docs as a hook
- dd16e08 Merge branch 'main' into mkdocs
- 9e84fe2 Add missing requirement for transpiling
- f46e2d5 Remove mystery indent
- 7a460ce Fix styling for feature tables
- 50e4896 Fix front-matter titles
- d6b9635 Make installation index a readme
- 215366c Update themes
- 77c2554 Ignore template and inc in docs
- ec284e7 Organise nav for getting started
- f28c9f6 Use dataclass for blocks and handle includes
- 0961b55 Update snippet tags in docs
- 09819d3 Handle `alt` from images
- 5385eb6 Update installation readme
- 1c3c4ad Remove custom CSS that's no longer used
- ceb799f Add warning for include with extra attrs
- cb03d40 Finish literalinclude
- bc70c9d Support math and code blocks
- 11cf31a Fix includes
- 1a7a691 Rename all `index` files to `README`
- 7c5539c Handle toctrees
- 7a2ead2 Clean up diff
- d388a24 Cleanup diff 2
- a47f141 Remove unnecessary leading `./`
- 230d0cd Document new docs building
- 192a551 Remove index stuff from `gen_examples.py`
- f351fe2 Formatting
- 7a274d9 Add API reference
- 85d45a4 Remove inline MyST code syntax
- 02f3a65 Fix API ref readme
- a44232a Fix code cross references
- 974c4ab Fix admonitions in docstrings
- 0f92d8f Comment out unhandled blocks
- 302556d Fix figure in LLMEngine.step
- 77e6d8e Remove argparse from CLI ref (tell user to run --help)
- 4c16f60 Remove unnecessary section in top level readme
- a5a45cc Fix mkdocs anchor syntax
- 23f6e46 Merge branch 'main' into mkdocs
- d643c4a Fix engine args page
- d8a4e90 Tweak links
- dfa3c30 Add latest warning to announcement banner
- 68209e0 Fix LMCache capitalisation
- f184c0c Fix announcement bar
- 3643ce0 Fix on startup hook
- 7740e35 Enable some search features
- 0406086 Improve headings in API ref
- 6387b4c Transpile twice, once to find all the explicit anchors, and once to u…
- 6429083 Improve API ref
- 916e087 Let transpiler handle these links
- a85e0d0 Merge branch 'main' into mkdocs
- cb9fcaa Reduce search rank of API ref
- 49b2069 Reduce repetition in API docs
- ec40428 Simplify `.nav.yml`
- 2b736ab Workaround no longer needed
- dc13d59 Restructure `.nav.yml`
- b80c71c Revert "Workaround no longer needed"
- 300cb81 Fix absolute image paths
- 5ce330a Fix blog nav
- 99cba62 Fix URL scheme titles
- f6853e0 Fix straggling `project:` links
- bb469d2 Transpiler improvement
- 30eafd5 Merge branch 'main' into mkdocs
- b35bdcd Fix confusing headings in API ref
- 9e4196b Tidy extra mkdocs files
- ab3abed Make API ref nav slightly better
- 6ff269f Merge branch 'main' into mkdocs
- 22ee168 Commit transpile output
- 680562f Remove transpile hook from config
- 7c2ce70 Fix gitignore for examples
- 4502dfb Fix some whitespace from transpile
- 331664c Fix url schemes
- 8a44a66 Make pre-commit happy
- 8d8cf0f update title for home page in nav
- 596e07e Fix double newline
- dd66626 Merge branch 'main' into mkdocs
- b4c2e75 Tabulate not needed now that we're not transpiling
- 0543aaa Merge branch 'main' into mkdocs
- c33a510 Review comments
- 07351e2 Fix pre-commit
- 94e88af Update `Documentation Build`
- 1f81b6e Merge branch 'main' into mkdocs
- a67d1cc Add FalconH1 back to supported models list
- 105370c Revert change to Dockerfile
- 56923ad Fix typo
- 982184b Merge branch 'main' into mkdocs
- 29f7267 Docs build needs the examples too
- 08fa15c Make pre-commit happy
- 06d9b72 Merge branch 'main' into mkdocs
- c57a89d Merge branch 'main' into mkdocs
- fe24554 Merge branch 'main' into mkdocs
- 7e8c725 Merge branch 'main' into mkdocs
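Many of the commit messages ("Handle list tables", "Handle image blocks", "Handle toctrees") belong to a one-off transpiler that rewrote the Sphinx/MyST sources into MkDocs-flavoured Markdown. As a rough, hypothetical sketch of the kind of transformation involved (this is not the PR's actual script, and the function name is invented), converting an RST admonition into a MkDocs one could look like:

```python
import re


def rst_note_to_admonition(text: str) -> str:
    """Convert a simple RST ``.. note::`` directive into a MkDocs
    ``!!! note`` admonition. Illustrative only -- the real transpiler
    in this PR handles many more block types (images, includes, TOCs).
    """
    out = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        m = re.match(r"^\.\.\s+(note|warning|tip)::\s*$", lines[i])
        if m:
            out.append(f"!!! {m.group(1)}")
            i += 1
            # Re-indent the directive body to MkDocs' 4-space convention
            while i < len(lines) and (not lines[i] or lines[i].startswith("   ")):
                body = lines[i].strip()
                out.append(f"    {body}" if body else "")
                i += 1
        else:
            out.append(lines[i])
            i += 1
    return "\n".join(out)


print(rst_note_to_admonition(".. note::\n\n   vLLM is fast."))
# !!! note
#
#     vLLM is fast.
```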
The new `.nav.yml` navigation config (+51 lines):

```yaml
nav:
  - Home:
    - vLLM: README.md
    - Getting Started:
      - getting_started/quickstart.md
      - getting_started/installation
      - Examples:
        - LMCache: getting_started/examples/lmcache
        - getting_started/examples/offline_inference
        - getting_started/examples/online_serving
        - getting_started/examples/other
    - Roadmap: https://roadmap.vllm.ai
    - Releases: https://github.com/vllm-project/vllm/releases
  - User Guide:
    - Inference and Serving:
      - serving/offline_inference.md
      - serving/openai_compatible_server.md
      - serving/*
      - serving/integrations
    - Training: training
    - Deployment:
      - deployment/*
      - deployment/frameworks
      - deployment/integrations
    - Performance: performance
    - Models:
      - models/supported_models.md
      - models/generative_models.md
      - models/pooling_models.md
      - models/extensions
    - Features:
      - features/compatibility_matrix.md
      - features/*
      - features/quantization
    - Other:
      - getting_started/*
  - Developer Guide:
    - contributing/overview.md
    - glob: contributing/*
      flatten_single_child_sections: true
    - contributing/model
  - Design Documents:
    - V0: design
    - V1: design/v1
  - API Reference:
    - api/README.md
    - glob: api/vllm/*
      preserve_directory_names: true
  - Community:
    - community/*
    - vLLM Blog: https://blog.vllm.ai
```
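The bare `serving/*` and `glob:` entries come from the mkdocs-awesome-nav plugin, which expands each pattern against the pages on disk. A rough sketch of the matching behaviour (hypothetical page paths, not the plugin's actual implementation):

```python
from fnmatch import fnmatch


def expand_nav_glob(pattern: str, pages: list[str]) -> list[str]:
    """Return the pages matching a nav glob, in sorted order.
    Hypothetical helper; the real plugin walks the docs/ tree."""
    return sorted(p for p in pages if fnmatch(p, pattern))


pages = [
    "serving/offline_inference.md",
    "serving/openai_compatible_server.md",
    "serving/metrics.md",
    "deployment/docker.md",
]
print(expand_nav_glob("serving/*", pages))
# → ['serving/metrics.md', 'serving/offline_inference.md', 'serving/openai_compatible_server.md']
```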
This file was deleted.
The docs README is rewritten (−43/+50 lines). Before, it held the Sphinx build instructions:

````markdown
# vLLM documents

## Build the docs

- Make sure in `docs` directory

  ```bash
  cd docs
  ```

- Install the dependencies:

  ```bash
  pip install -r ../requirements/docs.txt
  ```

- Clean the previous build (optional but recommended):

  ```bash
  make clean
  ```

- Generate the HTML documentation:

  ```bash
  make html
  ```

## Open the docs with your browser

- Serve the documentation locally:

  ```bash
  python -m http.server -d build/html/
  ```

  This will start a local server at http://localhost:8000. You can now open your browser and view the documentation.

  If port 8000 is already in use, you can specify a different port, for example:

  ```bash
  python -m http.server 3000 -d build/html/
  ```
````

After, it is the MkDocs home page:

````markdown
# Welcome to vLLM

<figure markdown="span">
  { align="center" alt="vLLM" class="no-scaled-link" width="60%" }
</figure>

<p style="text-align:center">
<strong>Easy, fast, and cheap LLM serving for everyone
</strong>
</p>

<p style="text-align:center">
<script async defer src="https://buttons.github.io/buttons.js"></script>
<a class="github-button" href="https://github.com/vllm-project/vllm" data-show-count="true" data-size="large" aria-label="Star">Star</a>
<a class="github-button" href="https://github.com/vllm-project/vllm/subscription" data-icon="octicon-eye" data-size="large" aria-label="Watch">Watch</a>
<a class="github-button" href="https://github.com/vllm-project/vllm/fork" data-icon="octicon-repo-forked" data-size="large" aria-label="Fork">Fork</a>
</p>

vLLM is a fast and easy-to-use library for LLM inference and serving.

Originally developed in the [Sky Computing Lab](https://sky.cs.berkeley.edu) at UC Berkeley, vLLM has evolved into a community-driven project with contributions from both academia and industry.

vLLM is fast with:

- State-of-the-art serving throughput
- Efficient management of attention key and value memory with [**PagedAttention**](https://blog.vllm.ai/2023/06/20/vllm.html)
- Continuous batching of incoming requests
- Fast model execution with CUDA/HIP graph
- Quantization: [GPTQ](https://arxiv.org/abs/2210.17323), [AWQ](https://arxiv.org/abs/2306.00978), INT4, INT8, and FP8
- Optimized CUDA kernels, including integration with FlashAttention and FlashInfer.
- Speculative decoding
- Chunked prefill

vLLM is flexible and easy to use with:

- Seamless integration with popular HuggingFace models
- High-throughput serving with various decoding algorithms, including *parallel sampling*, *beam search*, and more
- Tensor parallelism and pipeline parallelism support for distributed inference
- Streaming outputs
- OpenAI-compatible API server
- Support NVIDIA GPUs, AMD CPUs and GPUs, Intel CPUs, Gaudi® accelerators and GPUs, IBM Power CPUs, TPU, and AWS Trainium and Inferentia Accelerators.
- Prefix caching support
- Multi-lora support

For more information, check out the following:

- [vLLM announcing blog post](https://vllm.ai) (intro to PagedAttention)
- [vLLM paper](https://arxiv.org/abs/2309.06180) (SOSP 2023)
- [How continuous batching enables 23x throughput in LLM inference while reducing p50 latency](https://www.anyscale.com/blog/continuous-batching-llm-inference) by Cade Daniel et al.
- [vLLM Meetups][meetups]
````
A new API reference summary page (+107 lines):

```markdown
# Summary

[](){ #configuration }

## Configuration

API documentation for vLLM's configuration classes.

- [vllm.config.ModelConfig][]
- [vllm.config.CacheConfig][]
- [vllm.config.TokenizerPoolConfig][]
- [vllm.config.LoadConfig][]
- [vllm.config.ParallelConfig][]
- [vllm.config.SchedulerConfig][]
- [vllm.config.DeviceConfig][]
- [vllm.config.SpeculativeConfig][]
- [vllm.config.LoRAConfig][]
- [vllm.config.PromptAdapterConfig][]
- [vllm.config.MultiModalConfig][]
- [vllm.config.PoolerConfig][]
- [vllm.config.DecodingConfig][]
- [vllm.config.ObservabilityConfig][]
- [vllm.config.KVTransferConfig][]
- [vllm.config.CompilationConfig][]
- [vllm.config.VllmConfig][]

[](){ #offline-inference-api }

## Offline Inference

LLM Class.

- [vllm.LLM][]

LLM Inputs.

- [vllm.inputs.PromptType][]
- [vllm.inputs.TextPrompt][]
- [vllm.inputs.TokensPrompt][]

## vLLM Engines

Engine classes for offline and online inference.

- [vllm.LLMEngine][]
- [vllm.AsyncLLMEngine][]

## Inference Parameters

Inference parameters for vLLM APIs.

[](){ #sampling-params }
[](){ #pooling-params }

- [vllm.SamplingParams][]
- [vllm.PoolingParams][]

[](){ #multi-modality }

## Multi-Modality

vLLM provides experimental support for multi-modal models through the [vllm.multimodal][] package.

Multi-modal inputs can be passed alongside text and token prompts to [supported models][supported-mm-models]
via the `multi_modal_data` field in [vllm.inputs.PromptType][].

Looking to add your own multi-modal model? Please follow the instructions listed [here][supports-multimodal].

- [vllm.multimodal.MULTIMODAL_REGISTRY][]

### Inputs

User-facing inputs.

- [vllm.multimodal.inputs.MultiModalDataDict][]

Internal data structures.

- [vllm.multimodal.inputs.PlaceholderRange][]
- [vllm.multimodal.inputs.NestedTensors][]
- [vllm.multimodal.inputs.MultiModalFieldElem][]
- [vllm.multimodal.inputs.MultiModalFieldConfig][]
- [vllm.multimodal.inputs.MultiModalKwargsItem][]
- [vllm.multimodal.inputs.MultiModalKwargs][]
- [vllm.multimodal.inputs.MultiModalInputs][]

### Data Parsing

- [vllm.multimodal.parse][]

### Data Processing

- [vllm.multimodal.processing][]

### Memory Profiling

- [vllm.multimodal.profiling][]

### Registry

- [vllm.multimodal.registry][]

## Model Development

- [vllm.model_executor.models.interfaces_base][]
- [vllm.model_executor.models.interfaces][]
- [vllm.model_executor.models.adapters][]
```
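Two bits of syntax in this summary differ from Sphinx and are worth noting: `[](){ #configuration }` is an attr_list-style empty link that plants an explicit anchor, and `[vllm.config.ModelConfig][]` is mkdocstrings/autorefs shorthand that cross-links an API object by its dotted path. A minimal illustration (page and anchor names are invented):

```markdown
[](){ #my-anchor }

## Some Section

Link back with [this text][my-anchor], or link an API object
directly: [vllm.LLM][] renders as a link to the generated
`vllm.LLM` reference page.
```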
A new two-line metadata file lowers the search ranking of the pages beneath it (matching the "Reduce search rank of API ref" commit):

```yaml
search:
  boost: 0.5
```
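This uses the Material for MkDocs `search.boost` setting, where values below 1 down-rank matching pages. The same key can also be set per page in its front matter; a sketch with an illustrative value:

```yaml
---
search:
  boost: 0.5
---

# A down-ranked page
```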
File renamed without changes
(repeated for many more renamed files)
A reviewer asked:
Can we preserve some instructions on how to contribute and test documentation changes after this migration to MkDocs?
The reply:
Yeah, information about building the docs has moved to https://docs.vllm.ai/en/latest/contributing/overview.html#building-the-docs