Issues: vllm-project/vllm
[RFC]: Reimplement and separate beam search on top of vLLM core · #8306, opened Sep 9, 2024 by youkaichao · open, 6 comments
[Usage]: Standalone Debugging and Measuring the vLLM Engine Backend · usage · #8586, opened Sep 19, 2024 by htang2012
[Usage]: How to run vLLM on multiple TPU hosts (v4-32) · usage · #8582, opened Sep 18, 2024 by sparsh35
[Bug]: Wrong Response with Gemma2 with 8k context length · bug · #8580, opened Sep 18, 2024 by hahmad2008
[Bug]: Triton assertion errors serving Llama-3.1-8b on 4xH100s in FP32 precision · bug · #8579, opened Sep 18, 2024 by pgimenes
[Bug]: lm-format-enforcer guided decoding kills MQLLMEngine · bug · #8578, opened Sep 18, 2024 by joerunde
[Usage]: Use GGUF model with Docker when HF repo has multiple quant versions · usage · #8570, opened Sep 18, 2024 by mahenning
[Usage]: vllm OpenAI API Offline Batch Inference · usage · #8567, opened Sep 18, 2024 by pesc101
[Feature]: Offline quantization for Pixtral-12B · feature request · #8566, opened Sep 18, 2024 by KohakuBlueleaf
[Bug]: Intel GPU Arc 770 import vllm error · bug · #8565, opened Sep 18, 2024 by adi-lb-phoenix
[Bug]: Issue when benchmarking the dynamically served LoRA adapter · bug · #8564, opened Sep 18, 2024 by ducanh-ho2296
[Bug]: Installation with XPU fails with Dockerfile and while building from source · bug · #8563, opened Sep 18, 2024 by adi-lb-phoenix
[Misc]: Create ProfileConfig for Profiling · misc · #8561, opened Sep 18, 2024 by sylviayangyy
[Bug]: Profiling RuntimeError when with_stack=True · bug · #8560, opened Sep 18, 2024 by sylviayangyy
[Usage]: Behavior with LoRA Ranks dynamic loading · usage · #8559, opened Sep 18, 2024 by zhao-lun
[Bug]: Mistral file names are hardcoded in vllm, making fine tunes tough to use · bug · #8555, opened Sep 18, 2024 by dsingal0
[Bug]: Online serving failing for Phi-3-vision-128k-instruct · bug · #8553, opened Sep 18, 2024 by Muhtasham
[Usage]: Controlling the number of requests in a batch · usage · #8552, opened Sep 18, 2024 by shubh9m
[Feature]: Quantisation Support with CPU Backend · feature request · #8547, opened Sep 17, 2024 by Christofon
[Bug]: deepseek_Coder_v2_Instruct gives wrong output on vllm==0.5.4, 0.5.5, and 0.6.1.post2 (others not tried) with standard Hugging Face usage · bug · #8542, opened Sep 17, 2024 by iamhappytoo
[Misc]: RuntimeError: CUDA error: invalid configuration argument · misc · #8539, opened Sep 17, 2024 by YildizBurhan
[Bug]: Running Llama-3.1-405B on AMD MI300X with FP8 quantization fails · AMD GPU, bug, rocm · #8538, opened Sep 17, 2024 by danielphilipp