Releases · xorbitsai/inference
v0.5.1
What's new in 0.5.1 (2023-09-26)
These are the changes in inference v0.5.1.
Enhancements
- ENH: Safely iterate the output stream of ggml models by @codingl2k1 in #449 (see the sketch below)
- ENH: Skip download if model exists by @aresnow1 in #495
Documentation
- DOC: vLLM by @UranusSeven in #491
Full Changelog: v0.5.0...v0.5.1
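As a usage note for #449 and #495 above: the client can consume generation output as a stream, and relaunching an already-cached model no longer re-downloads its files. A minimal sketch, assuming the RESTful client API of this release line; the endpoint, model name, and `generate_config` keys are illustrative assumptions:

```python
# Minimal sketch for the v0.5.1 changes; endpoint, model name, and
# generate_config keys are assumptions based on this era's client API.
from xinference.client import RESTfulClient

client = RESTfulClient("http://localhost:9997")

# Launching a model whose files are already cached skips the
# download step (#495).
model_uid = client.launch_model(
    model_name="llama-2-chat",
    model_format="ggmlv3",
    model_size_in_billions=7,
    quantization="q4_0",
)
model = client.get_model(model_uid)

# With stream enabled, generate() yields incremental chunks; iterate
# defensively so one malformed chunk does not abort the stream (#449).
for chunk in model.generate(
    "Hello, ", generate_config={"stream": True, "max_tokens": 64}
):
    choices = chunk.get("choices") or []
    if choices:
        print(choices[0].get("text", ""), end="", flush=True)
```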
v0.5.0
What's new in 0.5.0 (2023-09-22)
These are the changes in inference v0.5.0.
New features
- FEAT: incorporate vLLM by @UranusSeven in #445 (see the usage sketch below)
- FEAT: add register model page for dashboard by @Bojun-Feng in #420
- FEAT: internlm 20b by @UranusSeven in #486
- FEAT: support glaive coder by @UranusSeven in #490
- FEAT: Support download models from modelscope by @aresnow1 in #475
Enhancements
- ENH: shorten OpenBuddy's description by @UranusSeven in #471
- ENH: enable vLLM on Linux with CUDA by @UranusSeven in #472
- ENH: vLLM engine supports more models by @UranusSeven in #477
- ENH: remove subpool on failure by @UranusSeven in #478
- ENH: support trust_remote_code when launching a model by @UranusSeven in #479
- ENH: vLLM auto tensor parallel by @UranusSeven in #480
Bug fixes
- BUG: llama-cpp version mismatch by @Bojun-Feng in #473
- BUG: incorrect endpoint on host 0.0.0.0 by @UranusSeven in #474
- BUG: prompt style not set as expected on web UI by @UranusSeven in #489
Full Changelog: v0.4.4...v0.5.0
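On the usage side of this release: vLLM is selected automatically for compatible models on Linux with CUDA (#472, #480), ModelScope can serve as the download source (#475), and `trust_remote_code` can be passed at launch time (#479). A hedged sketch; the environment variable and launch parameters below are assumptions drawn from this release line:

```python
# Hedged sketch of the v0.5.0 additions; names are assumptions.
from xinference.client import RESTfulClient

# Assumption: XINFERENCE_MODEL_SRC is read by the *server* process, so
# set it before starting the server, e.g.:
#   XINFERENCE_MODEL_SRC=modelscope xinference   # download via ModelScope (#475)

client = RESTfulClient("http://localhost:9997")

# On Linux with CUDA, a vLLM-compatible model is served by the vLLM
# engine automatically (#445, #472), with the tensor parallel degree
# inferred from the visible GPUs (#480).
model_uid = client.launch_model(
    model_name="internlm-20b",      # illustrative; added in #486
    model_format="pytorch",
    model_size_in_billions=20,
    quantization="none",
    trust_remote_code=True,         # forwarded to model loading (#479)
)
```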
v0.4.4
What's new in 0.4.4 (2023-09-19)
These are the changes in inference v0.4.4.
Bug fixes
- BUG: stop auto download from self-hosted storage for locale zh_CN by @UranusSeven in #465
Full Changelog: v0.4.3...v0.4.4
v0.4.3
v0.4.2
What's new in 0.4.2 (2023-09-15)
These are the changes in inference v0.4.2.
New features
- FEAT: concurrent generation by @codingl2k1 in #417
- FEAT: Support GGUF by @aresnow1 in #446 (see the sketch below)
- FEAT: Support OpenBuddy by @codingl2k1 in #444
Enhancements
- ENH: client supports describing models by @UranusSeven in #442
- ENH: caching from self-hosted storage by @UranusSeven in #419
- ENH: Assign worker sub pool at runtime instead of pre-allocating by @ChengjieLi28 in #437
- ENH: add benchmark script by @UranusSeven in #451
Bug fixes
- BUG: Fix restful client for embedding models by @aresnow1 in #439
- BUG: cmdline emits a double line break by @UranusSeven in #441
- BUG: no error raised on unsupported format by @UranusSeven in #443
- BUG: Xinference list failed if embedding models are launched by @aresnow1 in #452
Tests
- TST: skip self-hosted storage tests by @UranusSeven in #453
Documentation
- DOC: fix baichuan-2 and make naming consistent by @UranusSeven in #432
- DOC: update hot topics by @UranusSeven in #456
Others
- CI: Fix Windows CI by @codingl2k1 in #440
New Contributors
- @ChengjieLi28 made their first contribution in #437
Full Changelog: v0.4.1...v0.4.2
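For the GGUF support (#446) and the client-side model description (#442) above, a minimal sketch; the model name and quantization are illustrative assumptions:

```python
# Minimal sketch for v0.4.2; model name and quantization are assumptions.
from xinference.client import RESTfulClient

client = RESTfulClient("http://localhost:9997")

# GGUF supersedes the older ggml formats (#446).
model_uid = client.launch_model(
    model_name="llama-2-chat",
    model_format="ggufv2",
    model_size_in_billions=7,
    quantization="Q4_K_M",
)

# Describe a running model (#442): metadata such as format, size in
# billions, and quantization.
print(client.describe_model(model_uid))
```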
v0.4.1
What's new in 0.4.1 (2023-09-07)
These are the changes in inference v0.4.1.
Bug fixes
- BUG: Searching in UI results in white screen by @Bojun-Feng in #431
- BUG: Include json in MANIFEST.in by @aresnow1 in #435
Full Changelog: v0.4.0...v0.4.1
v0.4.0
What's new in 0.4.0 (2023-09-06)
These are the changes in inference v0.4.0.
New features
- FEAT: Support CodeLlama-Instruct by @jiayini1119 in #414
- FEAT: Add support for embedding models by @aresnow1 in #418 (see the sketch below)
- FEAT: Support replica by @codingl2k1 in #410
- FEAT: support baichuan2 by @UranusSeven in #425
Bug fixes
- BUG: cmdline chat duplicates the user message by @UranusSeven in #428
- BUG: llama_cpp model context length by @UranusSeven in #429
Documentation
- DOC: update readme by @UranusSeven in #423
New Contributors
- @codingl2k1 made their first contribution in #410
Full Changelog: v0.3.0...v0.4.0
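To illustrate the embedding support (#418) and replicas (#410) above, a sketch assuming this release's launch parameters; the model names are illustrative:

```python
# Sketch of the v0.4.0 features; model names are assumptions.
from xinference.client import RESTfulClient

client = RESTfulClient("http://localhost:9997")

# Launch an embedding model (#418); model_type distinguishes it from LLMs.
embed_uid = client.launch_model(
    model_name="bge-base-en",
    model_type="embedding",
)
print(client.get_model(embed_uid).create_embedding("Hello, world"))

# Launch an LLM with two replicas (#410); requests to the single
# model_uid are spread across the replicas.
llm_uid = client.launch_model(
    model_name="baichuan-2-chat",   # added in #425
    model_format="pytorch",
    model_size_in_billions=7,
    quantization="none",
    replica=2,
)
```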
v0.3.0
What's new in 0.3.0 (2023-09-04)
These are the changes in inference v0.3.0.
New features
- FEAT: Add Model Dashboard by @Bojun-Feng in #366
Enhancements
- ENH: help message for CLI by @Bojun-Feng in #367
- ENH: auto retry download on network errors by @jiayini1119 in #405
Bug fixes
- BUG: Asking to pad but the tokenizer does not have a padding token by @jiayini1119 in #407
- BUG: empty results for non-stream inference by @UranusSeven in #415
- BUG: Make context_length optional in model family by @Bojun-Feng in #394
Full Changelog: v0.2.3...v0.3.0
v0.2.3
What's new in 0.2.3 (2023-08-30)
These are the changes in inference v0.2.3.
Bug fixes
- BUG: fix subprocess log on Linux by @UranusSeven in #357
Others
- CHORE: lock llama-cpp-python version by @UranusSeven in #406
Full Changelog: v0.2.2...v0.2.3
v0.2.2
What's new in 0.2.2 (2023-08-25)
These are the changes in inference v0.2.2.
New features
- FEAT: Support Llama-2 PyTorch model by @jiayini1119 in #387
- FEAT: code-llama by @UranusSeven in #402
Enhancements
- ENH: Update max_tokens to 32k by @Bojun-Feng in #386
Bug fixes
- BUG: last token is duplicated by @UranusSeven in #398
Others
- Fix chatglm params by @Bojun-Feng in #400
Full Changelog: v0.2.1...v0.2.2