
[Docs] Various polishing. #4002

Merged (3 commits, Sep 26, 2024)
29 changes: 15 additions & 14 deletions README.md
@@ -26,27 +26,27 @@

----
:fire: *News* :fire:
- - [Sep, 2024] Point, Lanuch and Serve **Llama 3.2** on on Kubernetes or Any Cloud: [**example**](./llm/llama-3_2/)
- - [Sep, 2024] Run and deploy [Pixtral](./llm/pixtral), the first open-source multimodal model from Mistral AI.
- - [Jul, 2024] [Finetune](./llm/llama-3_1-finetuning/) and [serve](./llm/llama-3_1/) **Llama 3.1** on your infra
+ - [Sep, 2024] Point, Launch and Serve **Llama 3.2** on Kubernetes or Any Cloud: [**example**](./llm/llama-3_2/)
+ - [Sep, 2024] Run and deploy [**Pixtral**](./llm/pixtral), the first open-source multimodal model from Mistral AI.
+ - [Jul, 2024] [**Finetune**](./llm/llama-3_1-finetuning/) and [**serve**](./llm/llama-3_1/) **Llama 3.1** on your infra
- [Jun, 2024] Reproduce **GPT** with [llm.c](https://github.com/karpathy/llm.c/discussions/481) on any cloud: [**guide**](./llm/gpt-2/)
- - [Apr, 2024] Serve [**Qwen-110B**](https://qwenlm.github.io/blog/qwen1.5-110b/) on your infra: [**example**](./llm/qwen/)
- - [Apr, 2024] Using [**Ollama**](https://github.com/ollama/ollama) to deploy quantized LLMs on CPUs and GPUs: [**example**](./llm/ollama/)
- - [Feb, 2024] Deploying and scaling [**Gemma**](https://blog.google/technology/developers/gemma-open-models/) with SkyServe: [**example**](./llm/gemma/)
- - [Feb, 2024] Serving [**Code Llama 70B**](https://ai.meta.com/blog/code-llama-large-language-model-coding/) with vLLM and SkyServe: [**example**](./llm/codellama/)
- - [Dec, 2023] [**Mixtral 8x7B**](https://mistral.ai/news/mixtral-of-experts/), a high quality sparse mixture-of-experts model, was released by Mistral AI! Deploy via SkyPilot on any cloud: [**example**](./llm/mixtral/)
- - [Nov, 2023] Using [**Axolotl**](https://github.com/OpenAccess-AI-Collective/axolotl) to finetune Mistral 7B on the cloud (on-demand and spot): [**example**](./llm/axolotl/)
- - [Sep, 2023] Case study: [**Covariant**](https://covariant.ai/) transformed AI development on the cloud using SkyPilot, delivering models 4x faster cost-effectively: [**read the case study**](https://blog.skypilot.co/covariant/)
- - [Aug, 2023] **Finetuning Cookbook**: Finetuning Llama 2 in your own cloud environment, privately: [**example**](./llm/vicuna-llama-2/), [**blog post**](https://blog.skypilot.co/finetuning-llama2-operational-guide/)
+ - [Apr, 2024] Serve **Qwen-110B** on your infra: [**example**](./llm/qwen/)
+ - [Apr, 2024] Using **Ollama** to deploy quantized LLMs on CPUs and GPUs: [**example**](./llm/ollama/)
+ - [Feb, 2024] Deploying and scaling **Gemma** with SkyServe: [**example**](./llm/gemma/)
+ - [Feb, 2024] Serving **Code Llama 70B** with vLLM and SkyServe: [**example**](./llm/codellama/)
+ - [Dec, 2023] **Mixtral 8x7B**, a high quality sparse mixture-of-experts model, was released by Mistral AI! Deploy via SkyPilot on any cloud: [**example**](./llm/mixtral/)
+ - [Nov, 2023] Using **Axolotl** to finetune Mistral 7B on the cloud (on-demand and spot): [**example**](./llm/axolotl/)

<details>
<summary>Archived</summary>

- [Apr, 2024] Serve and finetune [**Llama 3**](https://skypilot.readthedocs.io/en/latest/gallery/llms/llama-3.html) on any cloud or Kubernetes: [**example**](./llm/llama-3/)
- [Mar, 2024] Serve and deploy [**Databricks DBRX**](https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm) on your infra: [**example**](./llm/dbrx/)
- [Feb, 2024] Speed up your LLM deployments with [**SGLang**](https://github.com/sgl-project/sglang) for 5x throughput on SkyServe: [**example**](./llm/sglang/)
- [Dec, 2023] Using [**LoRAX**](https://github.com/predibase/lorax) to serve 1000s of finetuned LLMs on a single instance in the cloud: [**example**](./llm/lorax/)
- [Sep, 2023] [**Mistral 7B**](https://mistral.ai/news/announcing-mistral-7b/), a high-quality open LLM, was released! Deploy via SkyPilot on any cloud: [**Mistral docs**](https://docs.mistral.ai/self-deployment/skypilot)
+ - [Sep, 2023] Case study: [**Covariant**](https://covariant.ai/) transformed AI development on the cloud using SkyPilot, delivering models 4x faster cost-effectively: [**read the case study**](https://blog.skypilot.co/covariant/)
+ - [Aug, 2023] **Finetuning Cookbook**: Finetuning Llama 2 in your own cloud environment, privately: [**example**](./llm/vicuna-llama-2/), [**blog post**](https://blog.skypilot.co/finetuning-llama2-operational-guide/)
- [July, 2023] Self-Hosted **Llama-2 Chatbot** on Any Cloud: [**example**](./llm/llama-2/)
- [June, 2023] Serving LLM 24x Faster On the Cloud [**with vLLM**](https://vllm.ai/) and SkyPilot: [**example**](./llm/vllm/), [**blog post**](https://blog.skypilot.co/serving-llm-24x-faster-on-the-cloud-with-vllm-and-skypilot/)
- [April, 2023] [SkyPilot YAMLs](./llm/vicuna/) for finetuning & serving the [Vicuna LLM](https://lmsys.org/blog/2023-03-30-vicuna/) with a single command!
@@ -153,11 +153,12 @@ SkyPilot then performs the heavy-lifting for you, including:
Refer to [Quickstart](https://skypilot.readthedocs.io/en/latest/getting-started/quickstart.html) to get started with SkyPilot.

## More Information
- To learn more, see our [Documentation](https://skypilot.readthedocs.io/en/latest/) and [Tutorials](https://github.com/skypilot-org/skypilot-tutorial).
+ To learn more, see our [documentation](https://skypilot.readthedocs.io/en/latest/), [blog](https://blog.skypilot.co/), and [community integrations](https://blog.skypilot.co/community/).

<!-- Keep this section in sync with index.rst in SkyPilot Docs -->
Runnable examples:
Collaborator: How about adding our blog and community integration page link to the sentence above as well?

Member Author: Done.

- LLMs on SkyPilot
+ - [Llama 3.2: lightweight and vision models](./llm/llama-3_2/)
+ - [Pixtral](./llm/pixtral/)
- [Llama 3.1 finetuning](./llm/llama-3_1-finetuning/) and [serving](./llm/llama-3_1/)
- [GPT-2 via `llm.c`](./llm/gpt-2/)
@@ -203,4 +204,4 @@ We are excited to hear your feedback!
For general discussions, join us on the [SkyPilot Slack](http://slack.skypilot.co).

## Contributing
- We welcome and value all contributions to the project! Please refer to [CONTRIBUTING](CONTRIBUTING.md) for how to get involved.
+ We welcome all contributions to the project! See [CONTRIBUTING](CONTRIBUTING.md) for how to get involved.
14 changes: 7 additions & 7 deletions docs/source/_gallery_original/index.rst
@@ -34,17 +34,17 @@ Contents
:maxdepth: 1
:caption: LLM Models

+ Vision Llama 3.2 (Meta) <llms/llama-3_2>
+ Llama 3.1 (Meta) <llms/llama-3_1>
+ Llama 3 (Meta) <llms/llama-3>
+ Llama 2 (Meta) <llms/llama-2>
+ CodeLlama (Meta) <llms/codellama>
Pixtral (Mistral AI) <llms/pixtral>
Mixtral (Mistral AI) <llms/mixtral>
Mistral 7B (Mistral AI) <https://docs.mistral.ai/self-deployment/skypilot/>
+ DBRX (Databricks) <llms/dbrx>
- Llama-2 (Meta) <llms/llama-2>
- Llama-3 (Meta) <llms/llama-3>
- Llama-3.1 (Meta) <llms/llama-3_1>
- Vision Llama-3.2 (Meta) <llms/llama-3_2>
- Qwen (Alibaba) <llms/qwen>
- CodeLlama (Meta) <llms/codellama>
+ Qwen 2.5 (Alibaba) <llms/qwen>
Gemma (Google) <llms/gemma>
- DBRX (Databricks) <llms/dbrx>

.. toctree::
:maxdepth: 1
4 changes: 2 additions & 2 deletions docs/source/_static/custom.js
@@ -27,11 +27,11 @@ document.addEventListener('DOMContentLoaded', () => {
const newItems = [
{ selector: '.caption-text', text: 'SkyServe: Model Serving' },
{ selector: '.toctree-l1 > a', text: 'Managed Jobs' },
- { selector: '.toctree-l1 > a', text: 'Llama-3.1 (Meta)' },
+ { selector: '.toctree-l1 > a', text: 'Pixtral (Mistral AI)' },
{ selector: '.toctree-l1 > a', text: 'Many Parallel Jobs' },
{ selector: '.toctree-l1 > a', text: 'Reserved, Capacity Blocks, DWS' },
- { selector: '.toctree-l1 > a', text: 'Llama-3.2 (Meta)' },
+ { selector: '.toctree-l1 > a', text: 'Llama 3.2 (Meta)' },
{ selector: '.toctree-l1 > a', text: 'Admin Policy Enforcement' },
];
newItems.forEach(({ selector, text }) => {
document.querySelectorAll(selector).forEach((el) => {
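The custom.js hunk above keeps a list of sidebar entries to decorate, pairing a CSS selector with the exact link text; the diff is truncated before the loop body, so the matching rule and badge markup in the sketch below are assumptions (the `matchElements`/`addBadges` names and the `badge-new` class are hypothetical, not the file's actual code):

```javascript
// Sketch of the sidebar "NEW"-badge pattern from custom.js (names hypothetical).
// Each entry pairs a CSS selector with the link text to decorate.
const newItems = [
  { selector: '.toctree-l1 > a', text: 'Pixtral (Mistral AI)' },
  { selector: '.toctree-l1 > a', text: 'Llama 3.2 (Meta)' },
];

// Pure helper so the matching rule is testable without a DOM:
// keep only elements whose trimmed text contains the target string.
function matchElements(elements, text) {
  return elements.filter((el) => el.textContent.trim().includes(text));
}

// Browser wiring (assumed): append a badge to each matched link.
function addBadges(doc) {
  newItems.forEach(({ selector, text }) => {
    matchElements(Array.from(doc.querySelectorAll(selector)), text).forEach(
      (el) =>
        el.insertAdjacentHTML(
          'beforeend',
          ' <span class="badge-new">NEW</span>'
        )
    );
  });
}
```

If the match is a substring check like this, it explains why renaming the sidebar entry from `Llama-3.2 (Meta)` to `Llama 3.2 (Meta)` in the toctree requires the parallel rename in custom.js: the old badge rule would silently stop matching.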
6 changes: 2 additions & 4 deletions docs/source/docs/index.rst
@@ -80,27 +80,25 @@ Runnable examples:

* **LLMs on SkyPilot**

+ * `Llama 3.2: lightweight and vision models <https://github.com/skypilot-org/skypilot/tree/master/llm/llama-3_2>`_
+ * `Pixtral <https://github.com/skypilot-org/skypilot/tree/master/llm/pixtral>`_
* `Llama 3.1 finetuning <https://github.com/skypilot-org/skypilot/tree/master/llm/llama-3_1-finetuning>`_ and `serving <https://github.com/skypilot-org/skypilot/tree/master/llm/llama-3_1>`_
* `GPT-2 via llm.c <https://github.com/skypilot-org/skypilot/tree/master/llm/gpt-2>`_
* `Llama 3 <https://github.com/skypilot-org/skypilot/tree/master/llm/llama-3>`_
* `Qwen <https://github.com/skypilot-org/skypilot/tree/master/llm/qwen>`_
* `Databricks DBRX <https://github.com/skypilot-org/skypilot/tree/master/llm/dbrx>`_
* `Gemma <https://github.com/skypilot-org/skypilot/tree/master/llm/gemma>`_
* `Mixtral 8x7B <https://github.com/skypilot-org/skypilot/tree/master/llm/mixtral>`_; `Mistral 7B <https://docs.mistral.ai/self-deployment/skypilot>`_ (from official Mistral team)
* `Code Llama <https://github.com/skypilot-org/skypilot/tree/master/llm/codellama/>`_
* `vLLM: Serving LLM 24x Faster On the Cloud <https://github.com/skypilot-org/skypilot/tree/master/llm/vllm>`_ (from official vLLM team)
* `SGLang: Fast and Expressive LLM Serving On the Cloud <https://github.com/skypilot-org/skypilot/tree/master/llm/sglang/>`_ (from official SGLang team)
* `Vicuna chatbots: Training & Serving <https://github.com/skypilot-org/skypilot/tree/master/llm/vicuna>`_ (from official Vicuna team)
* `Train your own Vicuna on Llama-2 <https://github.com/skypilot-org/skypilot/blob/master/llm/vicuna-llama-2>`_
* `Self-Hosted Llama-2 Chatbot <https://github.com/skypilot-org/skypilot/tree/master/llm/llama-2>`_
* `Ollama: Quantized LLMs on CPUs <https://github.com/skypilot-org/skypilot/tree/master/llm/ollama>`_
* `LoRAX <https://github.com/skypilot-org/skypilot/tree/master/llm/lorax/>`_
* `QLoRA <https://github.com/artidoro/qlora/pull/132>`_
* `LLaMA-LoRA-Tuner <https://github.com/zetavg/LLaMA-LoRA-Tuner#run-on-a-cloud-service-via-skypilot>`_
* `Tabby: Self-hosted AI coding assistant <https://github.com/TabbyML/tabby/blob/bed723fcedb44a6b867ce22a7b1f03d2f3531c1e/experimental/eval/skypilot.yaml>`_
* `LocalGPT <https://github.com/skypilot-org/skypilot/tree/master/llm/localgpt>`_
* `Falcon <https://github.com/skypilot-org/skypilot/tree/master/llm/falcon>`_
* Add yours here & see more in `llm/ <https://github.com/skypilot-org/skypilot/tree/master/llm>`_!

* Framework examples: `PyTorch DDP <https://github.com/skypilot-org/skypilot/blob/master/examples/resnet_distributed_torch.yaml>`_, `DeepSpeed <https://github.com/skypilot-org/skypilot/blob/master/examples/deepspeed-multinode/sky.yaml>`_, `JAX/Flax on TPU <https://github.com/skypilot-org/skypilot/blob/master/examples/tpu/tpuvm_mnist.yaml>`_, `Stable Diffusion <https://github.com/skypilot-org/skypilot/tree/master/examples/stable_diffusion>`_, `Detectron2 <https://github.com/skypilot-org/skypilot/blob/master/examples/detectron2_docker.yaml>`_, `Distributed <https://github.com/skypilot-org/skypilot/blob/master/examples/resnet_distributed_tf_app.py>`_ `TensorFlow <https://github.com/skypilot-org/skypilot/blob/master/examples/resnet_app_storage.yaml>`_, `NeMo <https://github.com/skypilot-org/skypilot/blob/master/examples/nemo/nemo_gpt_train.yaml>`_, `programmatic grid search <https://github.com/skypilot-org/skypilot/blob/master/examples/huggingface_glue_imdb_grid_search_app.py>`_, `Docker <https://github.com/skypilot-org/skypilot/blob/master/examples/docker/echo_app.yaml>`_, `Cog <https://github.com/skypilot-org/skypilot/blob/master/examples/cog/>`_, `Unsloth <https://github.com/skypilot-org/skypilot/blob/master/examples/unsloth/unsloth.yaml>`_, `Ollama <https://github.com/skypilot-org/skypilot/blob/master/llm/ollama>`_, `llm.c <https://github.com/skypilot-org/skypilot/tree/master/llm/gpt-2>`__, `Airflow <https://github.com/skypilot-org/skypilot/blob/master/examples/airflow/training_workflow>`_ and `many more <https://github.com/skypilot-org/skypilot/tree/master/examples>`_.
@@ -202,7 +200,7 @@ Read the research:
../cloud-setup/cloud-auth
../cloud-setup/quota
../cloud-setup/policy

.. toctree::
:hidden:
:maxdepth: 1
4 changes: 2 additions & 2 deletions llm/llama-2/README.md
@@ -1,7 +1,7 @@
<!-- $REMOVE -->
- # Self-Hosted Llama-2 Chatbot on Any Cloud
+ # Self-Hosted Llama 2 Chatbot on Any Cloud
<!-- $END_REMOVE -->
- <!-- $UNCOMMENT# Llama-2: Open LLM from Meta -->
+ <!-- $UNCOMMENT# Llama 2: Open LLM from Meta -->

[Llama-2](https://github.com/facebookresearch/llama/tree/main) is one of the top open-source models on the [Open LLM leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) today. It has been released with a license that authorizes commercial use. You can deploy a private Llama-2 chatbot with SkyPilot in your own cloud with just one simple command.

4 changes: 2 additions & 2 deletions llm/llama-3/README.md
@@ -1,7 +1,7 @@
<!-- $REMOVE -->
- # Scale Serving Llama-3 on Any Cloud or Kubernetes with SkyPilot
+ # Scale Serving Llama 3 on Any Cloud or Kubernetes with SkyPilot
<!-- $END_REMOVE -->
- <!-- $UNCOMMENT# Llama-3: Open LLM from Meta -->
+ <!-- $UNCOMMENT# Llama 3: Open LLM from Meta -->


<p align="center">