ci/cd: bringup flashinfer-jit-cache package #1726
Summary of Changes
Hello @yzh119, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request refactors the Ahead-of-Time (AOT) compilation and packaging of FlashInfer modules by creating a dedicated, standalone Python package. This change aims to improve modularity, streamline the distribution of pre-compiled kernels, and provide a more robust and isolated build process for these performance-critical components. The core compilation logic has been centralized and integrated into the new package's build system.
Code Review
This pull request introduces a new package flashinfer-aot-modules to hold the Ahead-Of-Time (AOT) compiled modules, separating them from the main flashinfer package. This is a good step towards better modularity. The changes include a new pyproject.toml and setup.py for the new package with custom build logic to compile the modules. The flashinfer/aot.py script has been refactored to centralize the compilation logic into a new compile_and_package_modules function, which improves code reuse and clarity.
My review focuses on improving code clarity and fixing a potential bug in the command-line usage of the AOT script. I've suggested using more modern pathlib features and ensuring consistency in fallback version strings. Most importantly, I've identified and proposed a fix for an issue where compilation paths might not be set correctly when using a custom build directory, which could break the build.
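The review comments themselves are not reproduced here, so the following is only an illustrative sketch of the general pattern the review describes, not the code from this PR. It assumes a hypothetical directory layout (generated, objects), a hypothetical helper pair resolve_build_paths and read_version, and a hypothetical FALLBACK_VERSION constant. The idea is to derive every output path from a single build root with pathlib, so a custom build directory is honored everywhere, and to keep the fallback version string in one place.

```python
from pathlib import Path
import tempfile

# Hypothetical: a single definition of the fallback version string,
# so every caller reports the same value when no version file exists.
FALLBACK_VERSION = "0.0.0"


def resolve_build_paths(build_dir):
    """Derive all output paths from one build root (hypothetical layout)."""
    root = Path(build_dir).expanduser().resolve()
    paths = {
        "root": root,
        "generated": root / "generated",  # assumed name, for illustration
        "objects": root / "objects",      # assumed name, for illustration
    }
    # Create the directories up front so later build steps can rely on them.
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
    return paths


def read_version(version_file):
    """Read a version string from a file, or return the shared fallback."""
    path = Path(version_file)
    if path.is_file():
        return path.read_text().strip()
    return FALLBACK_VERSION


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = resolve_build_paths(Path(tmp) / "custom-build")
        for name, path in paths.items():
            print(f"{name}: {path}")
        print("version:", read_version(Path(tmp) / "version.txt"))
```

Because every sub-path is computed from root, changing the build directory in one place moves all outputs together, which is the property the review says could otherwise break the build.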
91e6471 to 390b079
…ashinfer-dev into flashinfer-aot-modules-packing
📌 Description
Move the AOT modules to a standalone package, flashinfer-jit-cache.
This package and flashinfer-cubin are two additional packages that complement the core flashinfer-python package.
When only flashinfer-python is installed, every kernel is JIT-compiled or downloaded from Artifactory.
If these two additional packages are installed, most of the pre-compiled kernels/cubins are loaded from these wheels, without the need for JIT compilation or download.
🔍 Related Issues
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
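✅ Pre-commit Checks
- I have installed pre-commit by running pip install pre-commit (or used your preferred method).
- I have installed the hooks with pre-commit install.
- I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.
🧪 Tests
- All tests are passing (unittest, etc.).
Reviewer Notes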