
Conversation

@0xjunhao (Contributor) commented May 13, 2025

Update the Dockerfile to include Blackwell archs and use Ubuntu 24.04 as the base image.

I have tested that this works on an RTX 5090.
For reference, a test build is available at ubicloud/vllm-openai:latest.
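
For reviewers, here is a minimal sketch of the kind of build this enables, assuming vLLM's `torch_cuda_arch_list` build arg and `vllm-openai` target (arch values and tag are illustrative, not the exact diff):

```bash
# Sketch only: extend the CUDA arch list so nvcc also emits kernels for
# Blackwell, i.e. compute capability 10.0 (B100/B200) and 12.0
# (RTX 50-series such as the 5090), alongside the existing targets.
DOCKER_BUILDKIT=1 docker build . \
  --target vllm-openai \
  --tag ubicloud/vllm-openai:latest \
  --build-arg torch_cuda_arch_list='7.0 7.5 8.0 8.6 8.9 9.0 10.0 12.0+PTX'
```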

@github-actions (bot) commented

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small, essential subset of tests to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@simon-mo (Collaborator) previously approved these changes May 13, 2025


LGTM! Let's see if it builds, and I'll poll contributors on the Ubuntu 24.04 change.

@0xjunhao force-pushed the archs branch 2 times, most recently from 1bec9dc to ecbded0 on May 13, 2025 at 18:55
@mergify bot added the documentation label on May 13, 2025
@simon-mo dismissed their stale review on May 13, 2025 at 19:07

Seems like the Ubuntu 24.04 upgrade might break people's workflows.

@simon-mo (Collaborator) commented

Is it possible to do this without updating the base image?

@tlrmchlsmth (Member) left a comment

We need to stick to building with older OSes, unfortunately. The reason is that glibc is backwards-compatible but not forwards-compatible: binaries built against an older glibc run on newer systems, but binaries built against a newer glibc will not run on older ones. If we upgrade to Ubuntu 24.04, then vLLM won't work on 22.04, for instance.
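
For concreteness, here is a small sketch of how to check this on a given machine (the `.so` path is illustrative):

```bash
# glibc version on the host; a binary loads only if every GLIBC_x.y
# symbol version it references is at or below this.
ldd --version | head -n1

# List the glibc symbol versions a compiled vLLM extension requires.
# Building on Ubuntu 24.04 (glibc 2.39) pulls in newer symbol versions,
# which then fail to resolve on a 22.04 host (glibc 2.35).
objdump -T /path/to/vllm/_C.abi3.so | grep -o 'GLIBC_[0-9.]*' | sort -Vu
```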

Can the CUDA arch list be extended without upgrading Ubuntu?

@0xjunhao (Contributor, Author) commented

I think so. Fortunately, CUDA 12.8 still supports 20.04. There are two base images there: base was on 20.04 and vllm-base was on 22.04. Should I revert both?
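
A quick sanity check that both bases are still published for CUDA 12.8 (tags illustrative):

```bash
# Sketch only: confirm NVIDIA still publishes CUDA 12.8 images for both
# Ubuntu versions the Dockerfile already uses, so only the CUDA version
# needs to change rather than the OS.
docker pull nvidia/cuda:12.8.0-devel-ubuntu20.04   # "base" build stage
docker pull nvidia/cuda:12.8.0-base-ubuntu22.04    # "vllm-base" runtime stage
```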

@0xjunhao (Contributor, Author) commented

FYI, Ubuntu 20.04 will reach its EOL on May 31, 2025, which is only two weeks away.

@tlrmchlsmth (Member) commented

> FYI, Ubuntu 20.04 will reach its EOL on May 31, 2025, which is only two weeks away.

I think we should consider the OS upgrade separately. Upgrading to Ubuntu 22.04 will put us on glibc 2.35.

This will break vLLM on any system running a glibc older than 2.35, for instance Ubuntu 20.04 (glibc 2.31).

@0xjunhao requested a review from tlrmchlsmth on May 13, 2025 at 20:20
@0xjunhao changed the title from "[CI/Build] Update the Dockerfile to include Blackwell archs and use ubuntu 24.04 as base image" to "[CI/Build] Update the Dockerfile to include Blackwell archs" on May 13, 2025
@0xjunhao force-pushed the archs branch 2 times, most recently from 075ba43 to 812fda5 on May 13, 2025 at 23:09
Signed-off-by: Junhao Li <junhao@ubicloud.com>
@0xjunhao (Contributor, Author) commented

It seems that with the new archs, the image build check is timing out during the FlashInfer stage. Is there a way to increase the timeout limit?

[Screenshot, May 14, 2025: Buildkite image-build check timing out during the FlashInfer stage]

@alew3 commented May 16, 2025

@0xjunhao I tried running the model OpenGVLab/InternVL3-1B-Instruct on an RTX 5090 with the Docker image ubicloud/vllm-openai:latest and got this error:

CUDA error (/__w/xformers/xformers/third_party/flash-attention/hopper/flash_fwd_launch_template.h:175): no kernel image is available for execution on the device

@cchadowitz commented

> @0xjunhao I tried running the model OpenGVLab/InternVL3-1B-Instruct on an RTX 5090 with the Docker image ubicloud/vllm-openai:latest and got this error:
>
> CUDA error (/__w/xformers/xformers/third_party/flash-attention/hopper/flash_fwd_launch_template.h:175): no kernel image is available for execution on the device

I believe I could work around this by building the latest xformers from source.
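
For anyone else hitting this, a sketch of that workaround inside the container (the arch value is illustrative; `TORCH_CUDA_ARCH_LIST` is honored by the xformers build):

```bash
# Sketch only: rebuild xformers from source so its bundled
# flash-attention kernels are compiled for sm_120 (RTX 5090).
export TORCH_CUDA_ARCH_LIST="12.0"
pip install ninja          # optional, speeds up the compile
pip install -v -U --no-deps \
  "git+https://github.com/facebookresearch/xformers.git@main#egg=xformers"
```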

@0xjunhao (Contributor, Author) commented

Discussed with Simon, closing this PR. Please refer to PR #18095.

@0xjunhao closed this on May 16, 2025

Labels

ci/build, documentation

7 participants