huggingface / text-generation-inference Public

Pull requests: huggingface/text-generation-inference

28 Open 1,597 Closed

Pull requests list

Prepare for 3.2.4

#3218 opened May 9, 2025 by danieldk

5 tasks

change HPU warmup logic: seq length should be with exponential growth

#3217 opened May 9, 2025 by kaixuanliu

Deepseek r1

#3211 opened May 7, 2025 by sywangyi • Draft

5 tasks

Fix typos

#3210 opened May 6, 2025 by omahs

1 of 5 tasks

feat: lock updated kernel versions

#3201 opened Apr 29, 2025 by drbh

Set uv UV_PYTHON_INSTALL_DIR explicitly

#3197 opened Apr 27, 2025 by sebastianliebscher

1 of 5 tasks

README: minimum Python version is 3.10

#3194 opened Apr 25, 2025 by Frenzie

1 of 5 tasks

feat: support logit bias in chat request

#3186 opened Apr 22, 2025 by drbh

Fix flashinfer plan call to use positional arguments for #3165

#3166 opened Apr 11, 2025 by ruckc

2 of 5 tasks

Update to flashinfer 0.2.5

#3164 opened Apr 11, 2025 by danieldk • Draft

5 tasks

Add chunked attn for L4

#3162 opened Apr 10, 2025 by mht-sharma • Draft

2 of 7 tasks

Gaudi: add CI

#3160 opened Apr 10, 2025 by baptistecolle • Draft

Update links Inferentia refer docs

#3154 opened Apr 9, 2025 by guspan-tanadi

1 of 5 tasks

feat: align function id with tool call response

#3111 opened Mar 13, 2025 by drbh

wip: comment out prepend full_text

#3079 opened Mar 7, 2025 by jrc2139 • Draft

1 of 5 tasks

Expose the real-time internal state of the batcher through SSE

#3065 opened Feb 27, 2025 by mfuntowicz • Draft

Added model name label to metrics and added an optional argument --served-model-name (label: wontfix)

#3064 opened Feb 27, 2025 by yashaswipiplani

display available cached versions in TGI server error message of Neuron backend

#3063 opened Feb 26, 2025 by jimburtoft

4 tasks

Support xccl distributed backend

#3034 opened Feb 18, 2025 by dvrogozh

Fix CPU and memory affinity under external resource management

#3012 opened Feb 11, 2025 by askervin

[Backend] Introduce vLLM backend

#2976 opened Jan 31, 2025 by mfuntowicz

Kvrouter that will increase the kv-cache hits in case of multiple routing strategy

#2965 opened Jan 29, 2025 by Narsil

5 tasks

misc(gha): expose action cache url and runtime as secrets

#2964 opened Jan 29, 2025 by mfuntowicz

llava next image encoder to allow un-aligned patch / image sizes

#2936 opened Jan 22, 2025 by jimexist

5 tasks

Update Dockerfile to use devel image for compatibility

#2848 opened Dec 16, 2024 by YaserJaradeh

2 of 5 tasks
