
Customizable SM90 prefill kernels. #704

Merged · 6 commits into flashinfer-ai:main · Dec 29, 2024
Conversation

@hyhieu (Contributor) commented on Dec 28, 2024

Added customizable SM90 prefill kernels.

AOT command:

MAX_JOBS=224 \
FLASHINFER_ENABLE_AOT=1 \
TORCH_CUDA_ARCH_LIST=9.0a \
pip install -e . \
  --no-cache-dir --ignore-installed --verbose --force-reinstall

JIT command:

MAX_JOBS=224 \
FLASHINFER_ENABLE_AOT=0 \
TORCH_CUDA_ARCH_LIST=9.0a \
pip install -e . \
  --no-cache-dir --ignore-installed --verbose --force-reinstall

After either command, you can verify that the SM90 kernels work by running:

pytest -x tests/test_hopper.py

An example of a customizable SM90 prefill kernel has also been added to `tests/test_jit_example.py`. You can follow that example to add custom kernels.
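When adding a custom kernel along those lines, it helps to have a correctness oracle to compare against. The sketch below is a plain NumPy reference for what a causal prefill attention kernel computes, softmax(QK^T / sqrt(d)) V with a causal mask; it is an illustrative reference only (the function name, layouts, and scaling here are assumptions, not code from this PR or from flashinfer's API):

```python
import numpy as np

def ref_prefill_attention(q, k, v, causal=True):
    """Unfused reference attention: softmax(q @ k.T / sqrt(d)) @ v.

    q: [qo_len, d]; k, v: [kv_len, d]. Hypothetical oracle for checking
    a custom prefill kernel's output, up to layout and precision.
    """
    d = q.shape[-1]
    scores = (q @ k.T) / np.sqrt(d)
    if causal:
        qo_len, kv_len = scores.shape
        # Query i may attend to kv positions <= i + (kv_len - qo_len).
        mask = np.tril(np.ones((qo_len, kv_len), dtype=bool), k=kv_len - qo_len)
        scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    p = np.exp(scores)
    p = p / p.sum(axis=-1, keepdims=True)
    return p @ v
```

A custom kernel's output could then be checked with `np.testing.assert_allclose(kernel_out, ref_prefill_attention(q, k, v), rtol=...)`, with a tolerance appropriate for fp16/bf16.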

@yzh119 self-requested a review on Dec 28, 2024
@yzh119 (Collaborator) left a comment:

@hyhieu Thanks so much for the contribution! It works great in my environment; I left some suggestions.

Resolved review threads on:
- flashinfer/jit/attention.py (2)
- flashinfer/jit/batch_prefill_sm90_templ.py (2)
- flashinfer/jit/single_prefill_sm90_templ.py (1)
@yzh119 (Collaborator) left a comment:

LGTM, thank you!

@yzh119 yzh119 merged commit 4ba91c0 into flashinfer-ai:main Dec 29, 2024