@ckhordiasma
LucasWilkinson and others added 30 commits April 17, 2025 22:13
…m-project#16801)

Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: rongfu.leng <rongfu.leng@daocloud.io>
Signed-off-by: Luka Govedič <lgovedic@redhat.com>
Signed-off-by: Lu Fang <fanglu@fb.com>
…16796)

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Jonghyun Choe <andy.choe729@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…llm-project#16829)

Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
…nfig info (vllm-project#16857)

Signed-off-by: jmho <jaylenho734@gmail.com>
Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
…ect#15130)

Signed-off-by: fyabc <suyang.fy@alibaba-inc.com>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Xiong Wang <wangxiongts@163.com>
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
…llm-project#16591)

Signed-off-by: Jannis Schönleber <joennlae@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Jannis Schönleber <joennlae@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
… V1 (vllm-project#15477)

Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
Signed-off-by: Staszek Pasko <staszek@gmail.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
dtrifiro and others added 28 commits April 28, 2025 15:50
- remove build steps/dependencies
- allow for installing pre-built flash-attention/vllm wheels
- default ROCM_VERSION to 6.3.4, allowing override with env vars
- cleanup rocm docker bake, defaults
- amdsmi: use setup.py to build
- add amdsmi bind mount
- remove flashinfer from rocm target
- bump vllm-tgis-adapter to 0.7.0
- Dockerfile*.ubi: bump ubi base
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
…-project#17303)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…vllm-project#17255)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…rides are ordered (vllm-project#17256)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…17197)

Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
…t have shape (metadata_size) (vllm-project#17283)

Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
…_after_loading`. (vllm-project#16854)

Signed-off-by: charlifu <charlifu@amd.com>
Signed-off-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: andy-neuma <andy@neuralmagic.com>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
…ct results (vllm-project#17574)

Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
…client' (vllm-project#17434)

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: Rahul Tuli <rtuli@redhat.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Sync the midstream NM fork to the upstream tag
[v0.8.5.post1](https://github.com/vllm-project/vllm/tree/v0.8.5.post1),
plus the following cherry-picks (CP):
- vllm-project@be633fb, needed for benchmarks
- [CP](neuralmagic/nm-vllm-ent@1fe447d) for the compressed-tensors bump
- [CP](vllm-project#17677) for LoRA on AMD
- [CP](vllm-project#17315) for Llama 4 with pure dense layers

```
commit 31c73ba (HEAD -> upstream-v0.8.5, nm-fork/upstream-v0.8.5)
Author: Chauncey <chaunceyjiang@gmail.com>
Date:   Wed Apr 30 15:11:04 2025 +0800

    [Bugfix] Fix AttributeError: 'State' object has no attribute 'engine_client' (vllm-project#17434)
    
    Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

commit f8db0bd
Author: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Date:   Fri May 2 14:01:38 2025 -0400

    [BugFix][Attention] Fix sliding window attention in V1 giving incorrect results (vllm-project#17574)
    
    Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>

commit e335c34
Author: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Date:   Fri May 2 04:07:03 2025 -0400

    [BugFix] Fix Memory Leak (vllm-project#17567)
    
    Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

commit cc463fe
Merge: 1e358ff ba41cc9
Author: Selbi Nuryyeva <selbi@redhat.com>
Date:   Tue Apr 29 12:34:57 2025 -0400

    Merge branch 'tag-upstream-v0.8.5' into upstream-v0.8.5

commit ba41cc9 (tag: v0.8.5, tag-upstream-v0.8.5)
Author: Michael Goin <mgoin64@gmail.com>
Date:   Mon Apr 28 16:20:24 2025 -0600

    [Model] Add tuned triton fused_moe configs for Qwen3Moe (vllm-project#17328)
    
    Signed-off-by: mgoin <mgoin64@gmail.com>

commit dcbac4c
Author: Simon Mo <simon.mo@hey.com>
Date:   Mon Apr 28 14:12:01 2025 -0700

    [Model] Qwen3 Dense FP8 Compat Fixes (vllm-project#17318)
    
    Signed-off-by: simon-mo <xmo@berkeley.edu>
[...]
```

Commands
```
git fetch upstream
git checkout -b upstream-v0.8.5
git merge upstream/v0.8.5
git cherry-pick be633fb
```

TEST PLAN
accept sync:
https://github.com/neuralmagic/nm-cicd/actions/runs/14841223552
related PR in cicd: neuralmagic/nm-cicd#99
release workflow:
https://github.com/neuralmagic/nm-cicd/actions/runs/14845693864
This bumps the CUDA version in the base layer from 12-4 to 12-8.
It could break something if a dependency has to be built from source
during dep install, because the wheels we bring in later in prepare
are now built against CUDA 12.8.
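
One way to catch this kind of mismatch early is a small version check before the wheels are installed. The sketch below is illustrative only: the function names and the `BASE_CUDA` and `WHEEL_CUDA` variables are hypothetical and not part of this repository or its build. It simply normalizes the two spellings (`12-8` and `12.8`) and compares them.

```shell
#!/bin/sh
# Hypothetical helper: normalize a CUDA version written either in the
# package-suffix style ("12-8") or the dotted style ("12.8") to "12.8".
normalize_cuda_version() {
    printf '%s\n' "$1" | sed 's/-/./g'
}

# Hypothetical helper: succeed only when both versions agree once normalized.
cuda_versions_match() {
    [ "$(normalize_cuda_version "$1")" = "$(normalize_cuda_version "$2")" ]
}

# Assumed example values; in a real build these would come from the
# base-layer build arguments and from the wheel's build metadata.
BASE_CUDA="12-8"
WHEEL_CUDA="12.8"

if cuda_versions_match "$BASE_CUDA" "$WHEEL_CUDA"; then
    echo "CUDA versions agree: $(normalize_cuda_version "$BASE_CUDA")"
else
    echo "CUDA mismatch: base layer $BASE_CUDA, wheels $WHEEL_CUDA" >&2
fi
```

A check like this only flags the mismatch. It does not tell you whether a dependency built from source would actually fail against the newer toolkit.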

Notable conflicts were in Dockerfile.rocm.ubi and Dockerfile.ubi.

Up to date with the upstream v0.8.5.post1 tag; includes CPs for LoRA, Llama 4, and the compressed-tensors bump.
@ckhordiasma ckhordiasma merged commit 60c92f8 into main May 15, 2025
3 of 4 checks passed
@ckhordiasma ckhordiasma deleted the nm-vllm-ent-0.8.5-sync branch May 15, 2025 14:46