Pull requests: huggingface/text-generation-inference


Pull requests list

Add model_load_time metric
#2311 opened Jul 26, 2024 by Edwinhr716 updated Aug 9, 2024
2 of 5 tasks
[WIP] Add gfx1100 support to AMD pytorch build
#2642 opened Oct 13, 2024 by cazlo Draft updated Dec 10, 2024
1 of 5 tasks
Enable qwen2vl video
#2756 opened Nov 18, 2024 by drbh updated Jan 17, 2025
9 tasks done
Fix tool call response to adhere to OpenAI spec
#2949 opened Jan 24, 2025 by Datta0 updated Jan 24, 2025
[Backend] Introduce vLLM backend
#2976 opened Jan 31, 2025 by mfuntowicz updated Jan 31, 2025
Kvrouter that will increase the kv-cache hits in case of multiple routing strategy
#2965 opened Jan 29, 2025 by Narsil updated Jan 31, 2025
5 tasks
Add 'json_schema' alias to GrammarType.Json
#2982 opened Jan 31, 2025 by aW3st updated Jan 31, 2025
2 of 5 tasks
Get opentelemetry trace id from request headers instead of creating a new trace
#2648 opened Oct 15, 2024 by kozistr updated Feb 2, 2025
3 of 5 tasks
Update Dockerfile to use devel image for compatibility
#2848 opened Dec 16, 2024 by YaserJaradeh updated Feb 3, 2025
2 of 5 tasks
misc(gha): expose action cache url and runtime as secrets
#2964 opened Jan 29, 2025 by mfuntowicz updated Feb 5, 2025
General fixes for tool calling
#2954 opened Jan 24, 2025 by Trofleb updated Feb 10, 2025
2 of 4 tasks
llava next image encoder to allow un-aligned patch / image sizes
#2936 opened Jan 22, 2025 by jimexist updated Feb 10, 2025
5 tasks
change ChatCompletionChunk to align with "OpenAI Chat Completions str…
#3003 opened Feb 10, 2025 by sywangyi updated Feb 11, 2025
dockerfile change to ipex cpu/xpu
#3013 opened Feb 11, 2025 by sywangyi updated Feb 11, 2025
Fix CPU and memory affinity under external resource management
#3012 opened Feb 11, 2025 by askervin updated Feb 18, 2025
Add request parameters to OTel span for /v1/chat/completions endpoint
#3000 opened Feb 7, 2025 by aW3st updated Feb 19, 2025
1 of 5 tasks
Support xccl distributed backend
#3034 opened Feb 18, 2025 by dvrogozh updated Feb 19, 2025
Pr 2982 ci branch
#3046 opened Feb 20, 2025 by drbh updated Feb 20, 2025
Pr 3003 ci branch
#3007 opened Feb 10, 2025 by drbh updated Feb 21, 2025
push layer compressed with zstd instead of gzip
#2980 opened Jan 31, 2025 by co42 updated Feb 21, 2025
some minor fix
#3048 opened Feb 21, 2025 by sywangyi updated Feb 21, 2025
Pr 2954 ci branch
#3006 opened Feb 10, 2025 by drbh updated Feb 21, 2025
You need to seek apparently.
#3049 opened Feb 21, 2025 by Narsil updated Feb 21, 2025
5 tasks
Add Neuron backend
#3033 opened Feb 18, 2025 by dacorvo updated Feb 21, 2025
Update the llamacpp backend
#3022 opened Feb 14, 2025 by angt updated Feb 22, 2025