Finer-grained control over fp16/fp8 builds #722

Merged: 1 commit, Jan 8, 2025

@nandor (Contributor) commented Jan 7, 2025

Flags can be used to disable fp16 and either of the fp8 variants in order to speed up AOT builds.

By default, the configuration remains unchanged and the FLASHINFER_ENABLE_FP8 flag will enable both fp8 modes.
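
To illustrate the behavior described above, here is a minimal sketch of how a build script might resolve which data types to compile. It is not the code from this PR. Only `FLASHINFER_ENABLE_FP8` appears in the description; the `EXAMPLE_*` names, the `enabled_dtypes` helper, the default of "on", and the idea of reading the flags from environment variables are all assumptions made for illustration.

```python
import os


def _flag(env, name, default):
    """Read a boolean flag from an environment-like mapping."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "on", "true", "yes")


def enabled_dtypes(env=None):
    """Return the data types an AOT build would compile kernels for.

    FLASHINFER_ENABLE_FP8 is the umbrella flag named in the PR description.
    The EXAMPLE_* flags are hypothetical stand-ins for the finer-grained
    switches; the real names are not given in this conversation.
    """
    env = os.environ if env is None else env

    # Assumption: everything is enabled unless explicitly turned off.
    f16 = _flag(env, "EXAMPLE_ENABLE_F16", True)
    fp8 = _flag(env, "FLASHINFER_ENABLE_FP8", True)

    # Each fp8 variant inherits the umbrella flag unless set on its own.
    e4m3 = _flag(env, "EXAMPLE_ENABLE_FP8_E4M3", fp8)
    e5m2 = _flag(env, "EXAMPLE_ENABLE_FP8_E5M2", fp8)

    dtypes = []
    if f16:
        dtypes.append("f16")
    if e4m3:
        dtypes.append("fp8_e4m3")
    if e5m2:
        dtypes.append("fp8_e5m2")
    return dtypes
```

With no flags set, the sketch keeps every type enabled, mirroring the "configuration remains unchanged" default. Turning off a single variant, for example `enabled_dtypes({"EXAMPLE_ENABLE_FP8_E5M2": "0"})`, drops only that variant from the build list, so fewer kernels would need to be compiled.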

@yzh119 (Collaborator) left a comment

LGTM, thank you!

@yzh119 yzh119 merged commit 13de896 into flashinfer-ai:main Jan 8, 2025