
Install flash_attn in Docker image #3396

Merged · 2 commits · Mar 14, 2024

Conversation

@tdoublep (Member) commented on Mar 14, 2024

Recent fix #3269 removed flash_attn as an explicit dependency because it was breaking builds in several environments and increased the wheel size.

This PR leaves the wheel unchanged, but installs flash_attn independently within the Docker build (for both the test and the runtime images). This allows us to use the FlashAttentionBackend in containerized environments without affecting users who install the package in other ways.
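For illustration, a minimal sketch of what the separate install might look like in the Dockerfile; the stage names and version pin below are assumptions for this example, not the PR's actual diff:

```dockerfile
# Hypothetical excerpt from the vLLM Dockerfile.
# Stage names ("base", "test", "vllm") and the flash-attn version pin
# are assumptions for illustration only.

FROM base AS test
# Install flash-attn separately so it is not a dependency of the vLLM wheel.
RUN pip install flash-attn==2.5.6 --no-build-isolation

FROM base AS vllm
# Same install in the runtime image, so FlashAttentionBackend is available
# inside containers without changing what the published wheel requires.
RUN pip install flash-attn==2.5.6 --no-build-isolation
```

Installing the package only in the image keeps the wheel's dependency list (and size) unchanged while still making the backend usable in containerized deployments.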

@tdoublep changed the title from "Install flash-attn in Docker image" to "Install flash_attn in Docker image" on Mar 14, 2024
@simon-mo merged commit 06ec486 into vllm-project:main on Mar 14, 2024
24 checks passed
Temirulan pushed a commit to Temirulan/vllm-whisper referencing this pull request on Sep 6, 2024