Releases · huggingface/optimum-benchmark
v0.4.0
What's Changed
- Refactor backends and add `load` tracking by @IlyasMoutawwakil in #227
- Update readme by @IlyasMoutawwakil in #228
- Update vllm backend to support offline and online serving modes by @IlyasMoutawwakil in #232
- Misc CI updates and multi-platform support by @IlyasMoutawwakil in #233
- Add llama.cpp backend by @baptistecolle in #231 (usage sketch below)
- Misc changes and fixes for llama cpp by @IlyasMoutawwakil in #236
- release by @IlyasMoutawwakil in #237
New Contributors
- @baptistecolle made their first contribution in #231
Full Changelog: v0.3.1...v0.4.0
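As a quick illustration of the new llama.cpp backend from #231, the sketch below launches a GGUF benchmark through the library's Python API. The class names (`Benchmark`, `BenchmarkConfig`, `InferenceConfig`, `ProcessConfig`, `LlamaCppConfig`) follow the project's documented API around this release, but the model repo and GGUF filename are placeholders, not values taken from these notes.

```python
# Minimal sketch, not an official example: benchmarking a GGUF model with the
# llama.cpp backend added in #231. The model repo and filename are placeholders.
from optimum_benchmark import Benchmark, BenchmarkConfig, InferenceConfig, LlamaCppConfig, ProcessConfig

if __name__ == "__main__":
    benchmark_config = BenchmarkConfig(
        name="llama_cpp_tinyllama",
        launcher=ProcessConfig(),  # run the workload in an isolated process
        scenario=InferenceConfig(latency=True, memory=True),
        backend=LlamaCppConfig(
            device="cpu",
            task="text-generation",
            model="TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF",  # placeholder repo id
            filename="tinyllama-1.1b-chat-v1.0.Q4_0.gguf",  # placeholder GGUF file
        ),
    )
    benchmark_report = Benchmark.launch(benchmark_config)
    benchmark_report.log()  # print the collected latency/memory metrics
```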
v0.3.1
What's Changed
- Fix per token latency by @IlyasMoutawwakil in #223
- Per token latency outliers by @IlyasMoutawwakil in #225
- Patch release by @IlyasMoutawwakil in #224
Full Changelog: v0.3.0...v0.3.1
v0.3.0
What's Changed
- Remove experiment schema by @IlyasMoutawwakil in #210
- Numactl support by @IlyasMoutawwakil in #211 (usage sketch below)
- Fix sentence transformers models by @IlyasMoutawwakil in #212
- Enable security checks by @mfuntowicz in #216
- Fix `PyTorchBackend` TP vs DP inputs distribution across replicas and shards by @IlyasMoutawwakil in #218
- Pin eager attn in torch-ort backend by @IlyasMoutawwakil in #219
- Fix INC by @IlyasMoutawwakil in #220
- bump version 0.3.0 by @IlyasMoutawwakil in #221
New Contributors
- @mfuntowicz made their first contribution in #216
Full Changelog: v0.2.1...v0.3.0
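For the numactl support added in #211, here is a hedged sketch of how a launcher could pin the benchmarked process to a NUMA node. The `numactl`/`numactl_kwargs` parameter names are assumptions inferred from the feature's name, not confirmed by these notes.

```python
# Sketch only: CPU/memory pinning via the launcher's numactl support (#211).
# The numactl and numactl_kwargs parameter names are assumptions.
from optimum_benchmark import Benchmark, BenchmarkConfig, InferenceConfig, ProcessConfig, PyTorchConfig

if __name__ == "__main__":
    benchmark_config = BenchmarkConfig(
        name="pytorch_bert_numactl",
        launcher=ProcessConfig(
            numactl=True,  # wrap the worker process with numactl
            numactl_kwargs={"cpunodebind": 0, "membind": 0},  # bind to NUMA node 0
        ),
        scenario=InferenceConfig(latency=True, memory=True),
        backend=PyTorchConfig(model="bert-base-uncased", device="cpu"),
    )
    Benchmark.launch(benchmark_config)
```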
v0.2.1
What's Changed
- Llm perf update by @IlyasMoutawwakil in #206
- Fix diffusers repo id naming by @IlyasMoutawwakil in #208
- Release by @IlyasMoutawwakil in #209
Full Changelog: v0.2.0...v0.2.1
v0.2.0
What's Changed
- [feature][refactor] Optimum-Benchmark API by @IlyasMoutawwakil in #118 (usage sketch at the end of these notes)
- [feature][refactor] Benchmark Reporting + Hub Mixin by @IlyasMoutawwakil in #122
- [feature][refactor] Better Metrics and Trackers by @IlyasMoutawwakil in #124
- moved complex examples by @IlyasMoutawwakil in #127
- Fix ort inputs filtering by @IlyasMoutawwakil in #129
- Support per token measurements through logits processor by @IlyasMoutawwakil in #130
- Fix git revisions by @IlyasMoutawwakil in #131
- Support rocm benchmarking with text generation inference backend by @IlyasMoutawwakil in #132
- Use Py-TGI and add testing by @IlyasMoutawwakil in #134
- Text2Text input generator by @IlyasMoutawwakil in #139
- Faster DeepSpeed engine initialization by @IlyasMoutawwakil in #140
- llm-swarm backend integration for slurm clusters by @IlyasMoutawwakil in #142
- Better hub utils by @IlyasMoutawwakil in #143
- Support Py-TXI (TGI and TEI) by @IlyasMoutawwakil in #147
- Migrate CUDA CI workflows by @IlyasMoutawwakil in #156
- Fix: Enable Energy Calculation in Benchmarking by Implementing Subtraction Method by @karthickai in #149
- add test configurations for quantization with onnxruntime, awq, bnb (#95) by @aliabdelkader in #144
- Fix gptq exllamav2 check by @IlyasMoutawwakil in #152
- Fix gptq exllamav2 check by @IlyasMoutawwakil in #157
- Compute the real prefill latency using the logits processor by @IlyasMoutawwakil in #150
- zentorch plugin support by @IlyasMoutawwakil in #162
- torch compile diffusers vae by @IlyasMoutawwakil in #163
- Fix Exllama V2 typo by @IlyasMoutawwakil in #165
- Added test llama-2-7b with GPTQ quant. scheme by @lopozz in #141
- Update zentorch plugin by @IlyasMoutawwakil in #167
- Fix `to_csv` and `to_dataframe` by @IlyasMoutawwakil in #168
- add test configurations to run Torch compile (#95) by @aliabdelkader in #144
- Images builder CI by @IlyasMoutawwakil in #171
- Refactor test configs and CI by @IlyasMoutawwakil in #170
- Add OpenVINO GPU support by @helena-intel in #172
- Update readme, examples, makefile by @IlyasMoutawwakil in #173
- Add energy star benchmark by @regisss in #169
- Remove rocm5.6 support and add global vram tracking using pyrsmi by @IlyasMoutawwakil in #174
- Explicitly passing visible devices to isolation process by @IlyasMoutawwakil in #177
- Trackers revamp by @IlyasMoutawwakil in #178
- Full hub mixin integration by @IlyasMoutawwakil in #179
- Fix tasks by @IlyasMoutawwakil in #181
- fix py-txi by @IlyasMoutawwakil in #182
- specify which repo to push to by @IlyasMoutawwakil in #183
- Refactor prefill and inference benchmark by @IlyasMoutawwakil in #184
- Add LLM-Perf script and CI by @IlyasMoutawwakil in #185
- Fix isolation by @IlyasMoutawwakil in #186
- First warmup with the same input/output as benchmark by @IlyasMoutawwakil in #188
- Remove unnecessary surrogates attached to double quotes by @yamaura in #192
- save benchmark in files instead of passing them through a queue by @IlyasMoutawwakil in #191
- [refactor] add scenarios, drop experiments by @IlyasMoutawwakil in #187
- Use queues to not pollute cwd by @IlyasMoutawwakil in #193
- Update llm perf by @IlyasMoutawwakil in #195
- Gather llm perf benchmarks by @IlyasMoutawwakil in #198
- Build and Publish Images by @IlyasMoutawwakil in #199
- Communicate error/exception/traceback with main process by @IlyasMoutawwakil in #200
- vLLM backend by @IlyasMoutawwakil in #196
- update readme by @IlyasMoutawwakil in #201
- added 1xA100 by @IlyasMoutawwakil in #202
- Test quality for different python versions by @IlyasMoutawwakil in #203
- release by @IlyasMoutawwakil in #204
- v0.2.0 release bis by @IlyasMoutawwakil in #205
New Contributors
- @karthickai made their first contribution in #149
- @aliabdelkader made their first contribution in #144
- @lopozz made their first contribution in #141
- @helena-intel made their first contribution in #172
- @regisss made their first contribution in #169
- @yamaura made their first contribution in #192
Full Changelog: 0.0.1...v0.2.0
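Putting the headline v0.2.0 changes together (the Python API from #118, reporting and the Hub mixin from #122 and #179, `to_csv`/`to_dataframe` from #168), here is a minimal sketch of launching a benchmark, exporting the report, and pushing it to the Hub. The Hub repo id is a placeholder, and `push_to_hub` follows the standard huggingface_hub mixin convention rather than anything spelled out in these notes.

```python
# Sketch of the v0.2.0 Python API (#118) with reporting (#122/#168) and the
# Hub mixin (#179/#183). The Hub repo id below is a placeholder.
from optimum_benchmark import Benchmark, BenchmarkConfig, InferenceConfig, ProcessConfig, PyTorchConfig

if __name__ == "__main__":
    benchmark_config = BenchmarkConfig(
        name="pytorch_gpt2",
        launcher=ProcessConfig(device_isolation=False),
        scenario=InferenceConfig(latency=True, memory=True),
        backend=PyTorchConfig(model="gpt2", device="cpu", no_weights=True),
    )
    benchmark_report = Benchmark.launch(benchmark_config)

    # report utilities fixed in #168
    benchmark_report.to_csv("benchmark_report.csv")
    print(benchmark_report.to_dataframe())

    # Hub mixin (#179), with an explicit target repo as enabled by #183
    benchmark_report.push_to_hub("username/optimum-benchmark-reports")  # placeholder repo
```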