Pull requests: huggingface/text-generation-inference

Pull requests list

Make moe-kernels and marlin-kernels mandatory in CUDA installs
#2632 opened Oct 10, 2024 by danieldk
5 tasks
Clarify gated description and quicktour
#2631 opened Oct 9, 2024 by osanseviero
Add support for FP8 KV cache scales
#2628 opened Oct 9, 2024 by danieldk Draft
5 tasks
feat: add test for positional rotary embeddings
#2623 opened Oct 8, 2024 by drbh
Test Marlin MoE with desc_act=true
#2622 opened Oct 8, 2024 by danieldk
5 tasks
[DOCS] Add Google Cloud TGI integration via dedicated DLCs
#2612 opened Oct 5, 2024 by alvarobartt
1 of 5 tasks
Simplify the attention function
#2609 opened Oct 4, 2024 by danieldk
5 tasks
feat: prefill chunking
#2600 opened Oct 2, 2024 by OlivierDehaene
Cpu perf
#2596 opened Oct 1, 2024 by Narsil
5 tasks
feat: Add automatic nightly benchmarks
#2591 opened Sep 30, 2024 by Hugoch
1 of 5 tasks
Fp8 e4m3_fnuz support for rocm
#2588 opened Sep 30, 2024 by mht-sharma
5 tasks
break when there's nothing to read
#2582 opened Sep 28, 2024 by sywangyi
5 tasks
We can have a tokenizer anywhere.
#2527 opened Sep 17, 2024 by Narsil
5 tasks
Small fixes for supported models
#2471 opened Aug 29, 2024 by osanseviero
add gptq and awq int4 support in intel platform
#2444 opened Aug 22, 2024 by sywangyi
5 tasks
Improve vlm support (add idefics3 support)
#2437 opened Aug 20, 2024 by drbh Draft
4 tasks