
Pull requests: analytics-zoo/vllm

Pull requests list

Update prefix benchmark to latest version (0.6.5)
#69 opened Dec 20, 2024 by xiangyuT
Enable gemma model
#68 opened Dec 18, 2024 by hzjane
Add batch sdp_causal
#53 opened Nov 14, 2024 by xiangyuT
Add xpu communicator
#50 opened Nov 11, 2024 by gc-fu
Enable chunked_prefill and prefix caching
#48 opened Nov 7, 2024 by hzjane
Add multi-steps scheduler sycl kernel
#46 opened Oct 29, 2024 by gc-fu
Add chunked_prefill
#38 opened Sep 25, 2024 by gc-fu
First-phase: add sdp kernel without mask
#32 opened Sep 11, 2024 by gc-fu
fix qwen weird tuple issue
#21 opened Jun 17, 2024 by gc-fu
Split qkv and lm head to low bit
#20 opened Jun 5, 2024 by hkvision Draft
Disable ray
#18 opened Apr 8, 2024 by gc-fu
Bigdl llm vllm xpu
#16 opened Mar 12, 2024 by yangw1234