Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Docs] Add docs for Sky Serve #2794

Merged
merged 19 commits into from
Dec 11, 2023
Merged

[Docs] Add docs for Sky Serve #2794

merged 19 commits into from
Dec 11, 2023

Conversation

cblmemo
Copy link
Collaborator

@cblmemo cblmemo commented Nov 16, 2023

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: bash tests/backward_comaptibility_tests.sh

Copy link
Member

@concretevitamin concretevitamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
.. note::

The :code:`curl` command won't follow the redirect and print the content of the redirected page by default. Since we are using HTTP-redirect, you need to use :code:`curl -L <endpoint-url>`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding a section here, e.g., "Example: Text Generation Inference (TGI)", which can consist of the two snippets in https://docs.google.com/document/d/1vVmzLF-EkG3Moj-q47DQBGvFipK4PNfkz0V6LyaPstE/edit#heading=h.gntyowdq9a18 or https://docs.google.com/document/d/1vVmzLF-EkG3Moj-q47DQBGvFipK4PNfkz0V6LyaPstE/edit#heading=h.gr15nxiws63p

The value is it's much shorter --> easier to adapt. It also quickly shows the idea of one endpoint being backed by multiple regions/clouds' replicas.

Can discuss whether to put it in the opening section of this page, like User Docs does.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are adding this, do you think we still need the vicuna example? Not sure if it is a little bit redundant if we include TGI...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some redundancy is fine. The main motivation is to make the very first impression about real, useful AI serving. Currently it's HTTP server.

How about we add a "Quickstart: TGI service" section (or change it to vLLM/FastChat etc.), but keep the user doc's concise formatting -- 1 snippet showing YAML, 1 snippet showing service status, then add 1 snippet showing how to CURL it correctly. With minimal text throughout.

docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
@concretevitamin
Copy link
Member

We also may want to replace some snippets (e.g., sky serve status?) with screenshots which look way better with colors than the current formattin.

Copy link
Collaborator

@landscapepainter landscapepainter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a chance to go thorugh the doc to learn how SkyServe works. Left some comments on what I found.

docs/source/examples/service-yaml-spec.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
cblmemo and others added 7 commits December 8, 2023 14:35
Co-authored-by: Doyoung Kim <34902420+landscapepainter@users.noreply.github.com>
Co-authored-by: Doyoung Kim <34902420+landscapepainter@users.noreply.github.com>
Co-authored-by: Doyoung Kim <34902420+landscapepainter@users.noreply.github.com>
docs/source/index.rst Outdated Show resolved Hide resolved
docs/source/index.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
docs/source/examples/service-yaml-spec.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
docs/source/examples/sky-serve.rst Outdated Show resolved Hide resolved
:width: 800
:align: center
:alt: sky-serve-status-vicuna-ready

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a CURL based command like the user doc. Does something like this work

Then it’s ready to accept traffic!

$ curl -L Y.Y.Y.Y:8082/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
    -H 'Content-Type: application/json'
{"generated_text":"\n nobody knows"}

.. note::

The :code:`curl` command won't follow the redirect and print the content of the redirected page by default. Since we are using HTTP-redirect, you need to use :code:`curl -L <endpoint-url>`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some redundancy is fine. The main motivation is to make the very first impression about real, useful AI serving. Currently it's HTTP server.

How about we add a "Quickstart: TGI service" section (or change it to vLLM/FastChat etc.), but keep the user doc's concise formatting -- 1 snippet showing YAML, 1 snippet showing service status, then add 1 snippet showing how to CURL it correctly. With minimal text throughout.

@concretevitamin concretevitamin merged commit 8616cc5 into master Dec 11, 2023
19 checks passed
@concretevitamin concretevitamin deleted the serve-docs branch December 11, 2023 00:46
remyleone pushed a commit to remyleone/skypilot that referenced this pull request Dec 26, 2023
* add doc

* Rewording

* refactor pic

* apply suggestions from code review

* Update docs/source/examples/sky-serve.rst

Co-authored-by: Doyoung Kim <34902420+landscapepainter@users.noreply.github.com>

* Update docs/source/examples/sky-serve.rst

Co-authored-by: Doyoung Kim <34902420+landscapepainter@users.noreply.github.com>

* Update docs/source/examples/sky-serve.rst

Co-authored-by: Doyoung Kim <34902420+landscapepainter@users.noreply.github.com>

* upd

* fix confusing required

* add graph

* image size

* Apply suggestions from code review

Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>

* apply suggestions from code review

* upd

* Update docs/source/index.rst

Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>

* upd pic & output

* fix

* updates

---------

Co-authored-by: Zongheng Yang <zongheng.y@gmail.com>
Co-authored-by: Doyoung Kim <34902420+landscapepainter@users.noreply.github.com>
@cblmemo cblmemo mentioned this pull request Jan 27, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants