gesanqiu commented Jul 2, 2023

merrymercy commented Jul 2, 2023

The vLLM integration for the OpenAI API server has been fixed by lm-sys/FastChat#1835.
Could you test it? It should be compatible with (completion, chat-completion) x (streaming, non-streaming).

With vLLM only, you get simplicity and continuous batching.
With FastChat + vllm_worker, you get a distributed multi-model multi-worker controller + continuous batching.
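The compatibility matrix above, (completion, chat-completion) x (streaming, non-streaming), can be exercised with a small script. This is a minimal sketch only: the base URL, model name, and prompt below are placeholder assumptions, not part of the PR.

```python
# Build the four request payloads covering
# (completion, chat-completion) x (streaming, non-streaming).
# BASE_URL and the model name are assumptions; point them at your deployment.
import itertools

BASE_URL = "http://localhost:8000/v1"  # assumed local server address


def build_requests(model="vicuna-7b"):
    cases = []
    for endpoint, stream in itertools.product(
        ("completions", "chat/completions"), (False, True)
    ):
        payload = {"model": model, "stream": stream, "max_tokens": 16}
        if endpoint == "completions":
            payload["prompt"] = "Hello, my name is"
        else:
            payload["messages"] = [{"role": "user", "content": "Hello!"}]
        cases.append((f"{BASE_URL}/{endpoint}", payload))
    return cases


for url, payload in build_requests():
    print(url, "stream =", payload["stream"])
```

Each (url, payload) pair can then be POSTed with any HTTP client; a streaming response arrives as server-sent events (`data: ...` lines) while a non-streaming one is a single JSON body.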

gesanqiu commented Jul 2, 2023

> The vLLM integration for OpenAI API server has been fixed by lm-sys/FastChat#1835. Could you test it? It should be compatible with (completion, chat-completion) x (streaming, non-streaming)
>
> With vLLM only, you get simplicity and continuous batching. With FastChat + vllm_worker, you get a distributed multi-model multi-worker controller + continuous batching.

I will try it tomorrow, thx.

zhuohan123 left a comment


Thank you for your contribution! The changes look good to me in general. I left some small comments about adding advanced sampling functionality to the ChatCompletion API.

```python
object: str = "chat.completion.chunk"
created: int = Field(default_factory=lambda: int(time.time()))
model: str
choices: List[ChatCompletionResponseStreamChoice]
```
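The fragment under review uses `Field(default_factory=...)` so that `created` is stamped with the current Unix time when each chunk object is constructed. The same pattern can be sketched with stdlib dataclasses; the `StreamChunk` name and its fields are illustrative stand-ins, not vLLM's actual classes.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StreamChunk:
    # Hypothetical stand-in for the stream-response model above.
    model: str
    choices: List[Dict]
    object: str = "chat.completion.chunk"
    # default_factory runs at construction time, so each chunk
    # records its own creation timestamp.
    created: int = field(default_factory=lambda: int(time.time()))


chunk = StreamChunk(model="demo-model", choices=[])
print(chunk.object)  # chat.completion.chunk
```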
Nit: new line

zhuohan123 merged commit 49b26e2 into vllm-project:main on Jul 3, 2023
gesanqiu deleted the chatcompletion branch on July 7, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
Co-authored-by: dhuangnm <dhuang@MacBook-Pro-2.local>
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Sep 30, 2024
This PR fixes all the little warnings gaudi-installation.rst introduces
during documentation build ("WARNING: Title underline too short." etc.)
billishyahao pushed a commit to billishyahao/vllm that referenced this pull request Dec 31, 2024
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025
### What this PR does / why we need it?
Fix bugs of installation doc and format tool.

### Does this PR introduce _any_ user-facing change?
no.

### How was this patch tested?
no.

Signed-off-by: shen-shanshan <467638484@qq.com>
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Sep 16, 2025
* enable cutlass chunked-prefill

Signed-off-by: Yan Ma <yan.ma@intel.com>

* add required pkg for xpu-kernels compilation

Signed-off-by: Yan Ma <yan.ma@intel.com>

---------

Signed-off-by: Yan Ma <yan.ma@intel.com>

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
yma11 added a commit to yma11/vllm that referenced this pull request Sep 25, 2025
yma11 added a commit to yma11/vllm that referenced this pull request Oct 10, 2025
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Oct 11, 2025
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Oct 27, 2025
yma11 added a commit to yma11/vllm that referenced this pull request Nov 5, 2025
* layernorm use vllm_xpu_kernels

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* [ww34] switch silu_and_mul, reshape_and_cache_flash, rope to xpu kernel

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* update activation kernels

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* try remove ipex

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* switch to xpu kernel for w8a16 gemm (vllm-project#323)

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* enable cutlass chunked-prefill (vllm-project#330)

* enable cutlass chunked-prefill

Signed-off-by: Yan Ma <yan.ma@intel.com>

* add required pkg for xpu-kernels compilation

Signed-off-by: Yan Ma <yan.ma@intel.com>

---------

Signed-off-by: Yan Ma <yan.ma@intel.com>

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* enable topk/grouped_gemm based on llama4 (vllm-project#354)

* enable topk/grouped_gemm based on llama4

Signed-off-by: Yan Ma <yan.ma@intel.com>

* address comments

Signed-off-by: Yan Ma <yan.ma@intel.com>

---------

Signed-off-by: Yan Ma <yan.ma@intel.com>

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* enable CI

* replace lora kernels (vllm-project#347)

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* remove ipex (vllm-project#370)

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* update QA CI branch

* update QA CI yaml

* update QA CI yaml

* update QA CI yaml

* update QA CI yaml

* update QA CI yaml

* update QA CI yaml

* fix conflict

---------

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Yan Ma <yan.ma@intel.com>
Co-authored-by: Liu, Wenjun <wenjun.liu@intel.com>
yma11 added a commit to yma11/vllm that referenced this pull request Nov 10, 2025
yma11 added a commit to yma11/vllm that referenced this pull request Nov 10, 2025