-
Notifications
You must be signed in to change notification settings - Fork 334
[Bugfix] Fix autotune cache #1315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Hi! Thank you for contributing to the TileLang project. Please remember to run We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀 |
WalkthroughThe pull request refactors kernel persistence in the autotuner by introducing file-path constants, atomic write utilities, and reworked save/load logic to separately manage device and host kernels with backend-specific handling for nvrtc and tvm_ffi. Changes
Sequence DiagramsequenceDiagram
participant User
participant AutotuneResult
participant FileSystem
participant Backend
rect rgb(230, 245, 255)
Note over AutotuneResult,FileSystem: Save Flow (New)
User->>AutotuneResult: save_to_disk()
AutotuneResult->>AutotuneResult: _safe_write_file() for best_config
AutotuneResult->>AutotuneResult: _safe_write_file() for function & latency
alt nvrtc backend
AutotuneResult->>AutotuneResult: _safe_write_file() for device_kernel.cu
AutotuneResult->>AutotuneResult: _safe_write_file() for host_kernel.cu
AutotuneResult->>FileSystem: Save kernel.cubin (binary)
AutotuneResult->>FileSystem: Save kernel.py (helper)
else tvm_ffi backend
AutotuneResult->>AutotuneResult: _safe_write_file() for kernel sources
AutotuneResult->>AutotuneResult: _safe_write_executable() for executable.so
end
AutotuneResult->>AutotuneResult: _safe_write_file() for parameters
end
rect rgb(245, 230, 255)
Note over User,Backend: Load Flow (New)
User->>AutotuneResult: load_from_disk()
AutotuneResult->>AutotuneResult: Normalize target
AutotuneResult->>Backend: Resolve execution backend
AutotuneResult->>AutotuneResult: _load_kernel_from_disk(backend, target, compile_flags)
AutotuneResult->>FileSystem: Load device_kernel.cu + host_kernel.cu
AutotuneResult->>FileSystem: Load parameters
alt nvrtc backend
AutotuneResult->>FileSystem: Load kernel.cubin
else tvm_ffi backend
AutotuneResult->>FileSystem: Load executable.so
end
AutotuneResult-->>User: Return AutotuneResult instance
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
tilelang/autotuner/param.py (1)
302-316: Consider checking source file existence upfront.The existence check at line 315 only verifies
kernel_lib_pathandparams_path, but line 349 requires bothdevice_kernel_sourceandhost_kernel_sourceto be loaded. If source files are missing, the method will silently returnNoneeven though the kernel library exists.Consider adding source files to the existence check for clearer early detection:
- if not all([os.path.exists(file) for file in (kernel_lib_path, params_path)]): + required_files = [device_kernel_path, host_kernel_path, kernel_lib_path, params_path] + if not all([os.path.exists(file) for file in required_files]): + if verbose: + missing = [f for f in required_files if not os.path.exists(f)] + logger.debug(f"Cannot load kernel from {cache_path}: missing files {missing}") return None
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tilelang/autotuner/param.py(9 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tilelang/autotuner/param.py (4)
tilelang/cache/kernel_cache.py (3)
_load_binary(244-247)_safe_write_file(250-257)_safe_write_executable(260-263)tilelang/jit/kernel.py (5)
export_library(613-630)kernel_source(606-607)params(602-603)from_database(143-183)out_idx(598-599)tilelang/utils/target.py (1)
determine_target(62-123)tilelang/jit/execution_backend.py (1)
resolve_execution_backend(62-100)
🪛 Ruff (0.14.5)
tilelang/autotuner/param.py
204-204: Do not catch blind exception: Exception
(BLE001)
219-219: Do not catch blind exception: Exception
(BLE001)
328-328: Do not catch blind exception: Exception
(BLE001)
337-337: Do not catch blind exception: Exception
(BLE001)
346-346: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Quick Lint
🔇 Additional comments (12)
tilelang/autotuner/param.py (12)
27-34: LGTM! Constants properly aligned with cache layer.The new file path constants standardize kernel artifact naming across the codebase and provide clear backend-specific identifiers.
153-157: LGTM! Binary loader implementation is correct.The implementation matches the reference in
kernel_cache.pyand correctly reads binary files.
159-168: LGTM! Atomic write implementation is robust.The use of temporary files with
os.replaceensures atomic writes and prevents partial file corruption during concurrent operations. The explicitmakedirsfor the temp directory adds good defensive programming.
170-176: LGTM! Executable export implementation is correct.The method safely exports TVM executables using atomic file operations, correctly matching the pattern established in
kernel_cache.py.
196-220: LGTM! Device and host kernel source saving is well-implemented.The atomic writes and backend-specific host source selection (TVM FFI uses
get_host_source(), others useget_kernel_source()) correctly align with the framework's architecture.
261-268: LGTM! Parameter serialization uses atomic writes.The kernel parameters are now safely persisted using atomic write operations.
270-278: LGTM! Signature correctly extended with compile_flags.The
compile_flagsparameter addition aligns with theJITKernel.from_databasesignature and enables proper kernel reconstruction from cache.
318-348: LGTM! Source loading correctly handles device and host kernels.The separate loading of device and host kernel sources with appropriate error handling enables proper reconstruction of backend-specific kernels.
349-364: LGTM! Kernel reconstruction correctly requires all artifacts.The strict check ensuring both source files and parameters are loaded is appropriate, as
JITKernel.from_databaserequires all these inputs.
366-395: LGTM! All config and metadata writes are now atomic.Converting all file writes in
save_to_diskto use_safe_write_fileprevents partial writes and improves reliability during concurrent autotuning operations.
403-409: LGTM! Target normalization and backend resolution are correct.The explicit normalization of target and resolution of execution backend ensure kernels are loaded with the correct backend configuration.
437-437: Verify whether compile_flags should be persisted to cache.Line 437 passes
Noneforcompile_flagswith a comment noting they are "not tracked here". Ifcompile_flagsaffect kernel compilation behavior, not persisting them could cause kernels loaded from cache to behave differently than originally compiled.Run this script to check if compile_flags are used during kernel compilation:
If compile_flags significantly affect kernel behavior, consider persisting them alongside other kernel metadata.
| # Save kernel library (backend-specific) | ||
| try: | ||
| kernel_lib_path = os.path.join(cache_path, KERNEL_LIB_PATH) | ||
| src_lib_path = kernel.adapter.libpath | ||
| if verbose: | ||
| logger.debug(f"Saving kernel library to file: {kernel_lib_path}") | ||
| shutil.copy(src_lib_path, kernel_lib_path) | ||
| if kernel.execution_backend == "nvrtc": | ||
| kernel_lib_file = KERNEL_CUBIN_PATH | ||
| elif kernel.execution_backend == "tvm_ffi": | ||
| kernel_lib_file = EXECUTABLE_PATH | ||
| else: | ||
| kernel_lib_file = KERNEL_LIB_PATH | ||
|
|
||
| kernel_lib_path = os.path.join(cache_path, kernel_lib_file) | ||
|
|
||
| if kernel.execution_backend == "nvrtc": | ||
| # Save cubin and python helper file | ||
| src_lib_path = kernel.adapter.libpath | ||
| kernel_py_path = os.path.join(cache_path, KERNEL_PY_PATH) | ||
| py_src_path = src_lib_path.replace(".cubin", ".py") | ||
| if verbose: | ||
| logger.debug(f"Saving kernel nvrtc python code to file: {kernel_py_path}") | ||
| self._safe_write_file(kernel_py_path, "wb", | ||
| lambda f: f.write(self._load_binary(py_src_path))) | ||
| if verbose: | ||
| logger.debug(f"Saving kernel library to file: {kernel_lib_path}") | ||
| self._safe_write_file(kernel_lib_path, "wb", | ||
| lambda f: f.write(self._load_binary(src_lib_path))) | ||
| elif kernel.execution_backend == "tvm_ffi": | ||
| executable = kernel.adapter.executable | ||
| if verbose: | ||
| logger.debug(f"Saving kernel executable to file: {kernel_lib_path}") | ||
| self._safe_write_executable(executable, kernel_lib_path) | ||
| else: | ||
| src_lib_path = kernel.adapter.libpath | ||
| if verbose: | ||
| logger.debug(f"Saving kernel library to file: {kernel_lib_path}") | ||
| self._safe_write_file(kernel_lib_path, "wb", | ||
| lambda f: f.write(self._load_binary(src_lib_path))) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verify nvrtc Python helper file exists before reading.
The nvrtc path derives the Python helper file path using string replacement:
py_src_path = src_lib_path.replace(".cubin", ".py")Two concerns:
- String replacement could match
.cubinanywhere in the path, not just the extension - No verification that
py_src_pathexists before calling_load_binary(py_src_path)at line 241
Consider this more robust approach:
- src_lib_path = kernel.adapter.libpath
- kernel_py_path = os.path.join(cache_path, KERNEL_PY_PATH)
- py_src_path = src_lib_path.replace(".cubin", ".py")
+ src_lib_path = kernel.adapter.libpath
+ kernel_py_path = os.path.join(cache_path, KERNEL_PY_PATH)
+ # Use splitext + join to avoid matching .cubin in directory names
+ base_path = os.path.splitext(src_lib_path)[0]
+ py_src_path = base_path + ".py"
+
+ # Verify py file exists before reading
+ if not os.path.exists(py_src_path):
+ logger.warning(f"Python helper file not found: {py_src_path}")
+ else:| kernel = cls._load_kernel_from_disk( | ||
| cls, | ||
| path, | ||
| norm_target, | ||
| compile_args.target_host, | ||
| compile_args.out_idx, | ||
| resolved_backend, | ||
| compile_args.pass_configs, | ||
| None, # compile_flags not tracked here | ||
| func, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Clarify the method invocation pattern for _load_kernel_from_disk.
At line 430, cls is passed as the first argument to cls._load_kernel_from_disk(cls, ...). However, _load_kernel_from_disk is defined as an instance method with self as the first parameter (line 270). This works because the method doesn't actually use any instance state, but it creates confusion.
Consider converting _load_kernel_from_disk to a static method for clarity:
+ @staticmethod
def _load_kernel_from_disk(
- self,
cache_path: Path,
target: str | Target = "auto",Then call it as:
- kernel = cls._load_kernel_from_disk(
- cls,
+ kernel = AutotuneResult._load_kernel_from_disk(
path,Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In tilelang/autotuner/param.py around lines 429-439, the call
cls._load_kernel_from_disk(cls, ...) passes cls as the first argument to a
method currently defined as an instance method (self) which is confusing;
convert _load_kernel_from_disk to a staticmethod (add @staticmethod above its
definition and remove the self parameter or rename it to match actual args) so
it does not expect an instance, then update the call site to
cls._load_kernel_from_disk(path, norm_target, compile_args.target_host,
compile_args.out_idx, resolved_backend, compile_args.pass_configs, None, func)
(i.e., stop passing cls as the first arg) and ensure any internal uses that
referenced self are adjusted to use explicit parameters or class-level
references as needed.
This pull request refactors the kernel caching logic in
tilelang/autotuner/param.pyto improve atomicity, consistency, and backend support for saving and loading compiled kernels and related artifacts. It introduces safe file writing methods, aligns file naming conventions with the cache system, and enhances support for multiple execution backends. These changes help prevent data corruption, simplify cache management, and make backend-specific handling more robust.Atomic and Safe File Operations
_safe_write_fileand_safe_write_executablestatic methods to ensure atomic file writes, preventing partial writes and improving reliability during concurrent operations.save_to_diskand kernel save/load routines to use these atomic methods for configs, functions, latencies, and parameters. [1] [2]Consistent File Naming and Backend Support
device_kernel.cu,host_kernel.cu,executable.so, etc.) to match the conventions incache/kernel_cache.py, and made kernel library file selection backend-aware. [1] [2] [3].cubinand.py, TVM FFI's.so), including correct handling of host/device sources and executable export. [1] [2]Kernel Load/Save API Improvements
_load_kernel_from_diskand related calls to support new file naming, backend resolution, and compile flags, ensuring correct reconstruction ofJITKernelobjects from disk. [1] [2] [3]load_from_diskfor robust handling of different kernel compilation scenarios.These changes make the autotuner's disk caching more reliable, backend-aware, and maintainable.
Summary by CodeRabbit
Bug Fixes
Improvements
✏️ Tip: You can customize this high-level summary in your review settings.