Sync release with main for RHOAI 2.12 #110


openshift-ci · 2024-07-26T15:29:52Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dtrifiro

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [dtrifiro]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

* Putting server and LLM on different processes
* result queues cleanup
* Syncing server start with LLM init
* Renamed model to protocol to match the original endpoint
* Formatting
* Removing tuning parameters that are no longer used
* process spawn method from env
* Cleanup and refactor
* Where did this come from?
* mypy much
* One unhappy linter in the CI, one unhappy linter. Refactor it down, format it around, 3 unhappy linters in the CI

DarkLight1337 and others added 30 commits July 4, 2024 16:37

[VLM] Calculate maximum number of multi-modal tokens by model (vllm-project#6121)

ae96ef8

[VLM] Improve consistency between feature size calculation and dummy data for profiling (vllm-project#6146)

a41357e

[VLM] Cleanup validation and update docs (vllm-project#6149)

ea4b570

[Bugfix] Use templated datasource in grafana.json to allow automatic imports (vllm-project#6136)

0097bb1

Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>

[Frontend] Continuous usage stats in OpenAI completion API (vllm-project#5742)

f1e15da

[Bugfix] Add verbose error if scipy is missing for blocksparse attention (vllm-project#5695)

e58294d

bump version to v0.5.1 (vllm-project#6157)

abad574

[Docs] Fix readthedocs for tag build (vllm-project#6158)

79d406e

Update wheel builds to strip debug (vllm-project#6161)

2de490d

Fix release wheel build env var (vllm-project#6162)

f025062

Move release wheel env var to Dockerfile instead (vllm-project#6163)

bc96d5c

[Doc] Reorganize Supported Models by Type (vllm-project#6167)

175c43e

[Doc] Move guide for multimodal model and other improvements (vllm-project#6168)

9389380

[Model] Add PaliGemma (vllm-project#5189)

6206dcb

Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

add benchmark for fix length input and output (vllm-project#5857)

333306a

Co-authored-by: Roger Wang <ywang@roblox.com>

[ Misc ] Support Fp8 via llm-compressor (vllm-project#6110)

abfe705

Co-authored-by: Robert Shaw <rshaw@neuralmagic>

[misc][frontend] log all available endpoints (vllm-project#6195)

3b08fe2

Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>

do not exclude object field in CompletionStreamResponse (vllm-project#6196)

16620f4

Feature/add benchmark testing (vllm-project#5947)

717f4bc

Co-authored-by: Roger Wang <ywang@roblox.com>

[Kernel] reloading fused_moe config on the last chunk (vllm-project#6210)

f7a8fa3

[Kernel] Correctly invoke prefill & decode kernels for cross-attention (towards eventual encoder/decoder model support) (vllm-project#4888)

543aa48

Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

[Bugfix] use diskcache in outlines _get_guide vllm-project#5436 (vllm-project#6203)

185ad31

[Bugfix] Mamba cache Cuda Graph padding (vllm-project#6214)

ddc369f

Add FlashInfer to default Dockerfile (vllm-project#6172)

4f0e0ea

[hardware][cuda] use device id under CUDA_VISIBLE_DEVICES for get_device_capability (vllm-project#6216)

a3c9435

[core][distributed] fix ray worker rank assignment (vllm-project#6235)

70c232f

[Bugfix][TPU] Add missing None to model input (vllm-project#6245)

5d5b4c5

[Bugfix][TPU] Fix outlines installation in TPU Dockerfile (vllm-project#6256)

08c5bde

Add support for multi-node on CI (vllm-project#5955)

a0550cb

Signed-off-by: kevin <kevin@anyscale.com>

[CORE] Adding support for insertion of soft-tuned prompts (vllm-project#4645)

4d6ada9

Co-authored-by: Swapnil Parekh <swapnilp@ibm.com>
Co-authored-by: Joe G <joseph.granados@h2o.ai>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>

dtrifiro and others added 19 commits July 23, 2024 20:20

Dockerfile.ubi: pin vllm-tgis-adapter to 0.1.2

e15634d

gha: fix fetch step in upstream sync workflow

b2fd1af

gha: always update sync workflow PR body/title

fd4204b

Dockerfile.ubi: bump vllm-tgis-adapter to 0.1.3

8551e8f

Dockerfile.ubi: get rid of --distributed-executor-backend=mp

5fe6a00

this is the default when `--worker-use-ray` is not provided and world-size > 1
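
The default described in this commit message can be sketched as follows. This is a simplified illustration only, not vLLM's actual code: the function name `resolve_executor_backend` and its parameters are hypothetical, and the real selection logic may take further conditions into account.

```python
from typing import Optional


def resolve_executor_backend(
    worker_use_ray: bool,
    world_size: int,
    explicit_backend: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical helper mirroring the rule stated in the commit message."""
    # An explicitly requested backend always wins.
    if explicit_backend is not None:
        return explicit_backend
    # --worker-use-ray selects the Ray backend.
    if worker_use_ray:
        return "ray"
    # Without --worker-use-ray, multiprocessing ("mp") is the default
    # as soon as more than one worker is needed.
    if world_size > 1:
        return "mp"
    # Single-worker case: no distributed backend is required.
    return None
```

Under this reading, passing `--distributed-executor-backend=mp` explicitly is redundant whenever `--worker-use-ray` is absent and the world size exceeds 1, which is why the flag could be dropped from Dockerfile.ubi.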

Dockerfile.ubi: add flashinfer

f9ae74b

pin adapter to 2.0.0

280bc9f

deps: bump flashinfer to 0.0.9

b92b6d6

Update OWNERS with IBM folks

afd1436

Dockerfile.ubi: bind mount .git dir to allow inclusion of git commit hash

1a74d61

gha: remove reminder_comment

d05d51f

Dockerfile: bump vllm-tgis-adapter to 0.2.1

97cd508

fix: update setup.py to differentiate between fork and upstream

08a7f70

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>

Dockerfile.ubi: properly mount .git dir

242ea7e

Revert "[CI/Build] fix: update setup.py to differentiate between fork and upstream"

76aa5cf

Dockerfile.ubi: bump vllm-tgis-adapter to 0.2.2

61207a7

gha: remove unused upstream workflows

3c182aa

deps: bump vllm-tgis-adapter to 0.2.3

d379e0a

Dockerfile.ubi: get rid of custom cache manager

7a21f52

Fixed in vllm-project#6140.

Fixes https://issues.redhat.com/browse/RHOAIENG-8043

openshift-merge-robot added the needs-rebase label Jul 26, 2024

openshift-ci bot requested review from rpancham and Xaenalt July 26, 2024 15:29

openshift-ci bot added the approved label Jul 26, 2024

openshift-merge-robot removed the needs-rebase label Jul 26, 2024

dtrifiro force-pushed the sync-release-with-main branch from b56db72 to b145c20 Compare July 26, 2024 15:35

Merge branch 'release' into sync-release-with-main

b145c20

dtrifiro merged commit 849f0f5 into release Jul 26, 2024
0 of 4 checks passed

dtrifiro deleted the sync-release-with-main branch July 26, 2024 15:37
