Skip to content

Conversation

@Ubospica
Copy link
Collaborator

@Ubospica Ubospica commented Nov 17, 2025

This PR enhances the TvmFfiBuilder:

  1. Rename: TVMFFIBuilder -> TvmFfiBuilder (reverted by refactor: rename TvmFfiBuilder -> TVMFFIBuilder #118)
  2. Make it thread and process safe
  3. Refactor structure and add docs
  4. Add check for absolute paths and parent paths in Solution

This PR removes Python grammar check from the Pydantic model validation side. Instead, such advanced check should be performed in the compiler/builder. cc @xslingcn

Signed-off-by: Ubospica ubospica@gmail.com

Summary by CodeRabbit

  • New Features

    • Multi-process safe compilation with file-locking, caching of compiled artifacts, and language-aware build handling (CPU/GPU).
  • Bug Fixes

    • Enforced single-device input validation for tensor operations and ensured outputs are allocated to the validated device.
    • Strengthened source path validation to block absolute paths and path traversal.
  • Chores

    • Naming consistency updates and metadata enhancements for builds.
    • Updated test/config tooling and improved test coverage for caching and cross-build scenarios.

Signed-off-by: Ubospica <ubospica@gmail.com>
Signed-off-by: Ubospica <ubospica@gmail.com>
Signed-off-by: Ubospica <ubospica@gmail.com>
@coderabbitai
Copy link

coderabbitai bot commented Nov 17, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Renames TVMFFIBuilderTvmFfiBuilder, adds language detection (CUDA/CPP), file-locked cache-aware build flow with source persistence, strengthens entry-point and source-path validation, enforces single-device inputs in TVM runnable, updates tests and pytest/setuptools_scm config.

Changes

Cohort / File(s) Summary
Exports & registry
flashinfer_bench/compile/builders/__init__.py, flashinfer_bench/compile/registry.py
Updated imports/exports and registry instantiation to use TvmFfiBuilder instead of TVMFFIBuilder.
TVM FFI builder
flashinfer_bench/compile/builders/tvm_ffi_builder.py
Renamed class to TvmFfiBuilder; added Language enum (CUDA/CPP); introduced cache validation, file-lock synchronization, source writing, language-aware metadata (language, binary); improved entry-symbol validation and build flow returning TVMFFIRunnable.
Runnable behavior
flashinfer_bench/compile/runnable.py
Enforces single-device input for TVM runnable; outputs moved to validated device; removed close() from TVM runnable.
Solution validation
flashinfer_bench/data/solution.py
Added Path-based security checks to reject absolute paths and path traversal ("..") in source path entries; duplicate-path detection preserved.
Tests
tests/compile/test_tvm_ffi_builder.py, tests/data/test_solution.py
Tests updated to TvmFfiBuilder; added isolated_cache fixture; introduced CPP/CUDA source constants; expanded cache/subdirectory/cross-builder scenarios; added path traversal and absolute-path tests.
Project config
pyproject.toml
tool.pytest.ini_options.addopts updated to -rA --durations=0 --ignore=3rdparty; added tool.setuptools_scm.fallback_version = "0.0.0.dev0".

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant Builder as TvmFfiBuilder
    participant Lock as FileLock
    participant Cache
    participant Compiler as tvm_ffi.cpp.build
    participant Runnable

    Caller->>Builder: _build(sources, entry_symbol)
    Builder->>Builder: _detect_language(sources)
    Builder->>Cache: check cache metadata & binary
    alt valid cached artifact
        Cache-->>Builder: cached .so available
        Builder->>Runnable: create runnable from cached metadata
        Builder-->>Caller: return runnable
    else cache miss or stale
        Builder->>Lock: acquire(lock_path)
        Lock-->>Builder: lock acquired
        Builder->>Builder: _write_sources(sources)
        Builder->>Builder: _get_entry_symbol(entry_symbol)
        Builder->>Compiler: compile sources -> .so
        Compiler-->>Builder: output_lib_path
        Builder->>Builder: write metadata (language, binary)
        Builder->>Runnable: _make_runnable(metadata)
        Lock-->>Builder: release
        Builder-->>Caller: return runnable
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Pay special attention to tvm_ffi_builder.py for correctness of lock semantics, cache validation, and race conditions.
  • Review runnable.py for device-selection consequences and removal of close() (resource lifecycle).
  • Validate solution.py path-security checks for false positives/negatives.
  • Verify tests (tests/compile/test_tvm_ffi_builder.py, tests/data/test_solution.py) match new behaviors and reliably isolate cache.

Possibly related PRs

  • [tvm-ffi] TVMFFIBuilder #111 — Modifies the TVM FFI builder and registry exports; likely directly related to the builder rename/implementation differences.

Poem

🐰 I hopped through source and folder tree,
Locks clutched tight, a cache for me,
TvmFfiBuilder hums, compiles with care,
Languages sorted, binaries there.
A fluffy hop — reproducible flair! 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 89.66% which is sufficient. The required threshold is 80.00%.
Title check ✅ Passed The title 'refactor: Enhance TvmFfiBuilder' clearly and specifically summarizes the main change—renaming and enhancing the TvmFfiBuilder class with thread/process safety and structural improvements.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @Ubospica, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the TvmFfiBuilder to provide a more robust, thread-safe, and efficient compilation and caching experience. The primary goal is to ensure that the builder can reliably handle concurrent build requests from multiple processes or threads, leveraging an improved caching strategy to minimize redundant compilations. Additionally, the changes include a renaming of the builder class for better Pythonic adherence and refinements to source file handling and language detection.

Highlights

  • Renaming: The TVMFFIBuilder class has been renamed to TvmFfiBuilder for consistency with Python naming conventions. This change is reflected across all relevant files, including imports and instantiations.
  • Thread and Process Safety: The TvmFfiBuilder now incorporates tvm_ffi.utils.FileLock to ensure thread and process safety during compilation and caching. This prevents race conditions when multiple processes or threads attempt to build or access the same cached module simultaneously.
  • Enhanced Caching Logic: The caching mechanism has been significantly improved. The builder now performs a more robust check (_can_use_cached) to verify if a compiled shared library (.so) can be reused, ensuring that both the .so file and its corresponding source files exist and are identical to the requested solution. A double-check is performed after acquiring the file lock to handle scenarios where another process might have completed the build.
  • Source Language Detection: A new Language enum and _detect_language method have been introduced to explicitly classify source files as CUDA or C++ based on their extensions, improving clarity and potential future extensibility.
  • Improved Source File Handling: The _write_sources method now correctly handles source files located in subdirectories by ensuring parent directories are created as needed. This allows for more organized project structures within solutions.
  • Test Suite Enhancements: The test suite for TvmFfiBuilder has been updated to reflect the renaming and new functionalities. It includes a new isolated_cache fixture to prevent test interference, and new tests specifically for caching behavior across different builder instances and for source files in subdirectories.
  • Runnable Cleanup: The close method has been removed from TVMFFIRunnable as TVM-FFI handles resource cleanup automatically, simplifying the runnable interface.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Signed-off-by: Ubospica <ubospica@gmail.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly enhances the TvmFfiBuilder by renaming it to conform to PEP 8 and, more importantly, making it thread and process-safe using a file-based lock. The caching logic is now more robust, checking source file content to prevent stale builds, and the support for source files in subdirectories is a great addition. The code is also much more readable with extensive docstrings and improved test coverage.

I've identified a critical path traversal vulnerability in how source file paths are handled. An attacker could craft a Solution with malicious paths to read from or write to arbitrary files on the system. I've left two comments with suggestions to fix this vulnerability in both _can_use_cached and _write_sources methods. Addressing this is crucial for security.

Comment on lines +159 to +160

# Check if all files exist and content is identical
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This loop is vulnerable to a path traversal attack. If a Solution object is crafted with a path containing .. or an absolute path in sol.sources[...].path, this code will read from arbitrary locations on the filesystem, potentially leaking sensitive information. This is a critical security risk.

You should validate that all source paths are relative and do not contain .. components before using them.

        for src in sol.sources:
            if Path(src.path).is_absolute() or ".." in Path(src.path).parts:
                raise BuildError(
                    f"Invalid source path (absolute or path traversal): {src.path}"
                )
            src_path = path / src.path

Comment on lines 57 to +231
cuda_files: List[str] = []

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This loop is vulnerable to a path traversal attack. If a Solution object is crafted with a path containing .. or an absolute path in sol.sources[...].path, this code will write to arbitrary locations on the filesystem. This is a critical security risk.

You should validate that all source paths are relative and do not contain .. components before using them. A good place for this validation is at the beginning of the loop.

        for src in sol.sources:
            if Path(src.path).is_absolute() or ".." in Path(src.path).parts:
                raise BuildError(
                    f"Invalid source path (absolute or path traversal): {src.path}"
                )
            src_path = path / src.path

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (7)
flashinfer_bench/compile/builders/__init__.py (1)

4-6: Public rename to TvmFfiBuilder looks consistent; consider compatibility alias

Import and __all__ now consistently expose TvmFfiBuilder. If any external users depend on TVMFFIBuilder, you might optionally re-export an alias (TVMFFIBuilder = TvmFfiBuilder) for a deprecation window instead of a hard break.

tests/compile/test_tvm_ffi_builder.py (3)

1-2: Update test module docstring to match TvmFfiBuilder

The tests now target TvmFfiBuilder, but the module docstring still says “TVMFFIBuilder”. Consider updating it to avoid confusion.


93-132: CPU/GPU build-and-run tests align well with the add-one kernels

The test_build_cpp_cpu and test_build_cuda_gpu cases exercise both C++ and CUDA paths through TvmFfiBuilder, validating that the resulting runnables interoperate correctly with torch tensors. Minor nit: the inline comment “TVMFFIBuilder accepts CUDA language” could be renamed to “TvmFfiBuilder” for consistency.

Also applies to: 139-180


225-272: Caching tests exercise behavior but don’t strictly assert cache hits

Both test_caching_builder_level and test_caching_cross_builder validate that repeated builds (same builder vs different builders) produce consistent results, which is useful. However, they would still pass if the second build recompiled instead of hitting the cache. If you want stronger guarantees, consider monkeypatching tvm_ffi.cpp.build to count calls or assert that it’s invoked only once per unique key.

Also applies to: 274-321

flashinfer_bench/compile/builders/tvm_ffi_builder.py (3)

173-205: Make language detection case-insensitive (optional)

_detect_language currently uses str.endswith on the raw path string, so a file named kernel.CU or KERNEL.CU would not be recognised as CUDA.

Not critical, but you could strengthen this with:

-        for src in sol.sources:
-            path_str = str(src.path)
+        for src in sol.sources:
+            path_str = str(src.path).lower()

before the endswith(...) checks.


318-395: Caching and locking flow in _build is correct; consider per-key lock names

The build flow (pre-check cache → mkdir → acquire lock → re-check cache → build → load module → wrap) looks correct and multi-process safe, and the metadata dictionary is nicely structured.

One thing you might consider for throughput: _LOCK_FILE_NAME is a single fixed filename, which serialises all builds for this builder. Using a per-key lock (e.g. f"{key}.lock") would allow concurrent compilation of different solutions while still guarding each key against races:

-            with FileLock(build_path / self._LOCK_FILE_NAME):
+            with FileLock(build_path / f"{key}.lock"):

151-151: Long error messages vs. Ruff TRY003 (optional)

Ruff is flagging the longer BuildError messages (TRY003) in this file. The current messages are actually quite readable, so this is stylistic rather than functional.

If you want to satisfy Ruff, options include:

  • Shortening inline messages, or
  • Moving the detailed text into class-level constants or a dedicated exception subclass.

Given the clarity benefits, it may also be reasonable to keep the current style and adjust lints instead.

Also applies to: 165-165, 202-202, 268-270, 376-376, 382-382

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ecce37 and 8be49d8.

📒 Files selected for processing (6)
  • flashinfer_bench/compile/builders/__init__.py (1 hunks)
  • flashinfer_bench/compile/builders/tvm_ffi_builder.py (3 hunks)
  • flashinfer_bench/compile/registry.py (1 hunks)
  • flashinfer_bench/compile/runnable.py (1 hunks)
  • pyproject.toml (1 hunks)
  • tests/compile/test_tvm_ffi_builder.py (11 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
flashinfer_bench/compile/builders/__init__.py (1)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (1)
  • TvmFfiBuilder (32-395)
flashinfer_bench/compile/registry.py (1)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (1)
  • TvmFfiBuilder (32-395)
tests/compile/test_tvm_ffi_builder.py (2)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (1)
  • TvmFfiBuilder (32-395)
flashinfer_bench/compile/builder.py (2)
  • build (78-85)
  • BuildError (48-49)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (4)
flashinfer_bench/compile/builder.py (7)
  • Builder (52-95)
  • can_build (59-61)
  • _make_key (74-76)
  • create_pkg_name (33-45)
  • _make_closer (69-71)
  • BuildError (48-49)
  • build (78-85)
flashinfer_bench/data/solution.py (1)
  • SupportedLanguages (12-24)
flashinfer_bench/env.py (1)
  • get_fib_cache_path (46-57)
flashinfer_bench/compile/runnable.py (1)
  • TVMFFIRunnable (41-83)
🪛 Ruff (0.14.4)
flashinfer_bench/compile/runnable.py

66-66: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/builders/tvm_ffi_builder.py

151-151: Avoid specifying long messages outside the exception class

(TRY003)


165-165: Avoid specifying long messages outside the exception class

(TRY003)


202-202: Avoid specifying long messages outside the exception class

(TRY003)


268-270: Avoid specifying long messages outside the exception class

(TRY003)


376-376: Avoid specifying long messages outside the exception class

(TRY003)


382-382: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.12
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.11
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.10
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.13
🔇 Additional comments (15)
pyproject.toml (1)

96-100: Pytest options and setuptools_scm fallback look good

Expanded addopts will improve test reporting and timing visibility, and fallback_version is a sensible guard when VCS metadata is missing. Just make sure ignoring 3rdparty is intentional for all test environments.

flashinfer_bench/compile/runnable.py (1)

62-68: Single-device validation and output device selection are appropriate

Deriving the device set from input tensors and rejecting mixed-device inputs enforces a clear contract, and allocating outputs on that same device (or CPU when no tensor inputs) keeps TVM-FFI runs consistent. If you ever need more complex routing (e.g., nested containers or explicit device kwargs), this block is a good central place to extend.

flashinfer_bench/compile/registry.py (1)

55-63: Registry wiring for TvmFfiBuilder is correct

The lazy import and instantiation order keep the existing builder priority while switching to TvmFfiBuilder. No behavioral regressions here.

tests/compile/test_tvm_ffi_builder.py (5)

10-24: Isolated cache fixture is a solid guard against cross-test interference

The autouse isolated_cache fixture cleanly forces FIB_CACHE_PATH into a per-test temp dir, which is exactly what you want when asserting caching and rebuild behavior.


31-90: Centralizing CPP and CUDA kernels into CPP_SOURCE / CUDA_SOURCE improves maintainability

Pulling the kernel sources into shared constants avoids duplication across tests and makes later tweaks to the TVM-FFI kernels much easier.


187-223: can_build tests clearly pin the builder language contract

The CUDA/non-CUDA tests make the can_build contract explicit and will catch regressions if the accepted SupportedLanguages set changes.


324-365: test_call_dest_cpu gives good coverage of destination-passing semantics

Calling call_dest with preallocated output tensors and checking that they’re filled as expected is a nice direct test of the TVM-FFI runnable’s destination-passing interface on CPU.


367-437: Error-path and subdirectory tests robustly cover edge cases

The tests for invalid entry symbols, missing CUDA/C++ sources, and sources placed in subdirectories collectively exercise important failure and path-handling behavior in TvmFfiBuilder. The regex match for “No CUDA or C\+\+ sources” aligns with the new _detect_language error message, and the subdirectory test should prevent regressions in _write_sources.

Also applies to: 439-477

flashinfer_bench/compile/builders/tvm_ffi_builder.py (7)

20-30: Language enum and extension mapping look solid

The CUDA/CPP extension lists and Language enum are clear, minimal, and leave room for future extensions without complicating the builder logic.


71-84: can_build guard matches current language model

Checking sol.spec.language == SupportedLanguages.CUDA aligns with the existing SupportedLanguages enum and keeps the guard simple; the docstring explains the CUDA/C++ mix well enough.


86-99: Cache key prefixing with builder-specific prefix is appropriate

Using _KEY_PREFIX + create_pkg_name(solution) is a good way to avoid collisions with other builders while still leveraging the content-based hash in create_pkg_name.


101-108: No-op closer is acceptable here

Given there are no explicit external resources managed by this builder instance, returning a no-op closer keeps the interface uniform without adding complexity.


111-124: Build path layout under FIB_CACHE_PATH is reasonable

Building under get_fib_cache_path() / "tvm_ffi" / key keeps artifacts isolated per-builder and per-solution and plays nicely with the global cache env var.


248-272: Entry symbol extraction and validation are clear

Validating the entry_point format and then taking the final component after :: keeps errors early and messages understandable; this is consistent with the “file.ext::symbol” contract.


273-316: Runnable wrapping with keyword adapter is well-aligned with Definition

Building arg_order from defn.inputs + defn.outputs and then adapting **kwargs into a positional call keeps the TVM-FFI module focused on raw tensors while the TVMFFIRunnable handles allocation and the public interface. The strict key lookup (raising on missing kwargs) is reasonable here.

Comment on lines +32 to 50
class TvmFfiBuilder(Builder):
"""Builder using TVM-FFI with automatic caching and supports multi-process and multi-threaded
compilation. The result is framework agnostic and supports DLPack interop with PyTorch, JAX,
etc.
Cache logic: If the builder is asked to build the same solution again, it will return the cached
result. If another builder is asking to build the same solution, as long as the build directory
exists, it will return the cached result.
The solution to compile should be written in destination-passing style, i.e. the function
should take the input tensors and the output tensors as arguments.
Examples
--------
>>> builder = TVMFFIBuilder()
>>> runnable = builder.build(definition, solution)
>>> output = runnable(x=input_tensor) # Allocates and returns output
>>> runnable.call_dest(x=input_tensor, output=output_tensor) # Destination-passing style
"""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Docstrings still reference TVMFFIBuilder instead of TvmFfiBuilder

The class docstring example and __init__ doc still use the old class name, which can confuse users.

Suggested fix:

-    >>> builder = TVMFFIBuilder()
+    >>> builder = TvmFfiBuilder()
-        """Initialize the TVMFFIBuilder.
+        """Initialize the TvmFfiBuilder.

Also applies to: 61-66

🤖 Prompt for AI Agents
In flashinfer_bench/compile/builders/tvm_ffi_builder.py around lines 32 to 50
(and also apply the same change at lines 61 to 66), the class docstring and
examples still reference the old class name "TVMFFIBuilder"; update all
occurrences to the correct class name "TvmFfiBuilder" (including constructor
docs, examples, and any inline references) so the docstring matches the actual
class name and avoids confusing users.

Comment on lines 126 to 171
def _can_use_cached(self, path: Path, key: str, sol: Solution) -> bool:
"""Check if cached .so can be used by comparing source files and .so existence.
Returns True (can use cached .so) only if:
1. The compiled .so file exists
2. All source files exist with identical content
Parameters
----------
path : Path
Build directory path
key : str
Unique key for this solution (used to find .so file)
sol : Solution
Solution containing source files
Returns
-------
bool
True if cached .so can be used, False if compilation is needed
"""
# Check if build directory exists
if not path.exists():
return False
elif not path.is_dir():
raise BuildError(f"Build directory exists but is not a directory: {path}")

# Check if .so exists
so_path = path / f"{key}.so"
if not so_path.is_file():
return False

# Check if all files exist and content is identical
for src in sol.sources:
src_path = path / src.path

if not src_path.exists():
return False
elif not src_path.is_file():
raise BuildError(f"Source path exists but is not a file: {src_path}")

if src_path.read_text() != src.content:
return False

# All checks passed: can use cached .so
return True
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Constrain src.path to avoid writing outside the cache directory

Both _can_use_cached and _write_sources use:

src_path = path / src.path

If src.path can contain absolute paths or .. segments (even accidentally), this can:

  • Write files outside build_path (potentially clobbering unrelated files).
  • Break the assumption that the build directory is self-contained for caching.
  • Make _can_use_cached compare against the wrong on-disk file.

Even if Solution.sources is “trusted” today, hardening this path handling will prevent subtle bugs and future security issues.

Suggested hardening (schematic):

-from pathlib import Path
+from pathlib import Path
+from typing import cast

 ...

-        for src in sol.sources:
-            src_path = path / src.path
+        for src in sol.sources:
+            raw = Path(src.path)
+            if raw.is_absolute():
+                raise BuildError(f"Source path must be relative, got absolute: {raw}")
+
+            # Normalise and ensure we stay within build_path
+            src_path = (path / raw).resolve()
+            try:
+                src_path.relative_to(path)
+            except ValueError:
+                raise BuildError(f"Source path escapes build dir: {raw}")

Apply the same raw = Path(src.path) / resolve() / relative_to(path) pattern consistently in _can_use_cached when reconstructing src_path so that cache validation operates on the same safe, normalised paths.

Also applies to: 206-246

🧰 Tools
🪛 Ruff (0.14.4)

151-151: Avoid specifying long messages outside the exception class

(TRY003)


165-165: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
In flashinfer_bench/compile/builders/tvm_ffi_builder.py around lines 126-171,
_can_use_cached currently constructs src_path with path / src.path which allows
absolute paths or '..' to escape the build dir; change to: resolve the build dir
once (build_root = path.resolve()), convert src.path to a Path (raw =
Path(src.path)), join and resolve to get src_path = (build_root /
raw).resolve(), then ensure src_path is inside the build_root by calling
src_path.relative_to(build_root) and raise BuildError if it fails; apply the
identical pattern in _write_sources (lines ~206-246) so both validation and
writing use the same safe, normalized paths and prevent writes/reads outside the
cache directory.

Signed-off-by: Ubospica <ubospica@gmail.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (4)

48-48: Update example to use new class name.

The docstring example still references the old class name TVMFFIBuilder.

Apply this diff:

-    >>> builder = TVMFFIBuilder()
+    >>> builder = TvmFfiBuilder()

64-64: Update docstring to reference new class name.

The __init__ docstring still references the old class name TVMFFIBuilder.

Apply this diff:

-        """Initialize the TVMFFIBuilder.
+        """Initialize the TvmFfiBuilder.

241-243: Replace assertions with explicit validation.

Using assert for security checks is risky because assertions can be disabled with Python's -O optimization flag. Use explicit if statements with BuildError exceptions instead.

Apply this diff:

-            # Defensive assertion: path should be validated at Solution creation time
+            # Defensive check: path should be validated at Solution creation time
             src_path_obj = Path(src.path)
-            assert not src_path_obj.is_absolute(), f"Absolute path detected: {src.path}"
-            assert ".." not in src_path_obj.parts, f"Path traversal detected: {src.path}"
+            if src_path_obj.is_absolute():
+                raise BuildError(f"Absolute path detected: {src.path}")
+            if ".." in src_path_obj.parts:
+                raise BuildError(f"Path traversal detected: {src.path}")

165-167: Replace assertions with explicit validation.

Using assert for security checks is risky because assertions can be disabled with Python's -O optimization flag, potentially allowing malicious paths through. While the Solution model validator already performs these checks, defense-in-depth should use explicit if statements with exceptions.

Apply this diff:

-            # Defensive assertion: the path in the solution should be validated by the Solution
-            # model validator, but we add this defensive assertion to be safe.
+            # Defensive check: the path in the solution should be validated by the Solution
+            # model validator, but we add this defensive check to be safe.
             src_path_obj = Path(src.path)
-            assert not src_path_obj.is_absolute(), f"Absolute path detected: {src.path}"
-            assert ".." not in src_path_obj.parts, f"Path traversal detected: {src.path}"
+            if src_path_obj.is_absolute():
+                raise BuildError(f"Absolute path detected: {src.path}")
+            if ".." in src_path_obj.parts:
+                raise BuildError(f"Path traversal detected: {src.path}")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8be49d8 and 4d4b94f.

📒 Files selected for processing (3)
  • flashinfer_bench/compile/builders/tvm_ffi_builder.py (3 hunks)
  • flashinfer_bench/data/solution.py (2 hunks)
  • tests/data/test_solution.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/data/test_solution.py (1)
flashinfer_bench/data/solution.py (4)
  • BuildSpec (61-92)
  • SupportedLanguages (13-25)
  • Solution (95-171)
  • SourceFile (28-58)
flashinfer_bench/compile/builders/tvm_ffi_builder.py (4)
flashinfer_bench/compile/builder.py (7)
  • Builder (52-95)
  • can_build (59-61)
  • _make_key (74-76)
  • create_pkg_name (33-45)
  • _make_closer (69-71)
  • BuildError (48-49)
  • build (78-85)
flashinfer_bench/data/solution.py (2)
  • Solution (95-171)
  • SupportedLanguages (13-25)
flashinfer_bench/env.py (1)
  • get_fib_cache_path (46-57)
flashinfer_bench/compile/runnable.py (1)
  • TVMFFIRunnable (41-83)
🪛 Ruff (0.14.4)
tests/data/test_solution.py

99-99: Probable insecure usage of temporary file or directory: "/tmp/kernel.cpp::add_one_cpu"

(S108)


108-108: Probable insecure usage of temporary file or directory: "/tmp/kernel.cpp"

(S108)

flashinfer_bench/data/solution.py

132-132: Avoid specifying long messages outside the exception class

(TRY003)


134-134: Avoid specifying long messages outside the exception class

(TRY003)

flashinfer_bench/compile/builders/tvm_ffi_builder.py

154-154: Avoid specifying long messages outside the exception class

(TRY003)


174-174: Avoid specifying long messages outside the exception class

(TRY003)


211-211: Avoid specifying long messages outside the exception class

(TRY003)


282-284: Avoid specifying long messages outside the exception class

(TRY003)


390-390: Avoid specifying long messages outside the exception class

(TRY003)


396-396: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.10
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.12
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.11
  • GitHub Check: Run unit tests on ubuntu-latest and Python 3.13
🔇 Additional comments (5)
flashinfer_bench/data/solution.py (1)

129-134: Excellent security validation!

The path traversal and absolute path checks effectively prevent security vulnerabilities. These checks are performed at Solution creation time, ensuring that malicious paths are rejected before they reach the build system.

tests/data/test_solution.py (1)

76-109: Comprehensive security test coverage!

These tests properly validate that the path traversal and absolute path protections work as expected. Using pytest.raises with the match parameter ensures the correct error messages are raised.

flashinfer_bench/compile/builders/tvm_ffi_builder.py (3)

128-180: Well-designed cache validation logic!

The _check_sources method correctly validates the cache by checking both .so existence and source file content matching. The early return pattern makes the logic clear and efficient.


182-213: Clean language detection implementation.

The logic correctly identifies CUDA vs CPP based on file extensions and provides a clear error when no valid sources are found.


366-390: Excellent multi-process safe caching implementation!

The double-checked locking pattern with FileLock ensures thread-safe and process-safe builds. The cache-hit path avoids unnecessary file operations, and the lock-protected rebuild path prevents race conditions.


def _check_sources(self, path: Path, key: str, sol: Solution) -> bool:
"""Check if the source code is vaild, and if the cached .so can be used by comparing source
files and .so existence.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix typo in docstring.

"vaild" should be "valid".

Apply this diff:

-        """Check if the source code is vaild, and if the cached .so can be used by comparing source
+        """Check if the source code is valid, and if the cached .so can be used by comparing source
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
files and .so existence.
"""Check if the source code is valid, and if the cached .so can be used by comparing source
files and .so existence.
🤖 Prompt for AI Agents
In flashinfer_bench/compile/builders/tvm_ffi_builder.py around line 130, the
docstring contains a typo: replace the word "vaild" with the correct spelling
"valid" so the sentence reads properly; update the docstring text accordingly
and run a quick lint/check to ensure no other typos remain.

seen_paths = set()
for source in self.sources:
# Security check: prevent path traversal attacks
src_path = Path(source.path)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix typo in docstring.

"vaild" should be "valid".

Apply this diff:

-    def _check_sources(self, path: Path, key: str, sol: Solution) -> bool:
-        """Check if the source code is vaild, and if the cached .so can be used by comparing source
+        """Check if the source code is valid, and if the cached .so can be used by comparing source

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In flashinfer_bench/data/solution.py around line 130, the docstring contains a
typo "vaild" — update the spelling to "valid" in the docstring so the word is
correct and consistent with documentation standards.

Signed-off-by: Ubospica <ubospica@gmail.com>
Signed-off-by: Ubospica <ubospica@gmail.com>
Signed-off-by: Ubospica <ubospica@gmail.com>
@Ubospica
Copy link
Collaborator Author

cc @YiyanZhai

@Ubospica Ubospica changed the title Enhance TvmFfiBuilder refactor: Enhance TvmFfiBuilder Nov 17, 2025
@Ubospica Ubospica merged commit e9eee63 into flashinfer-ai:main Nov 17, 2025
8 checks passed
@Ubospica
Copy link
Collaborator Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly enhances the TvmFfiBuilder by renaming it for consistency, making it thread and process-safe with file-based locking, and refactoring its structure for better clarity and maintainability. The addition of comprehensive docstrings, improved caching logic, and robust security checks for source file paths are excellent improvements. The test suite has also been substantially upgraded to cover these new features, including multi-process caching scenarios and security validations. I have a few minor suggestions for further improvement.

Comment on lines +48 to +49
>>> builder = TVMFFIBuilder()
>>> runnable = builder.build(definition, solution)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The example in the docstring uses the old class name TVMFFIBuilder. This should be updated to TvmFfiBuilder to match the refactoring in this PR.

Suggested change
>>> builder = TVMFFIBuilder()
>>> runnable = builder.build(definition, solution)
>>> builder = TvmFfiBuilder()
>>> runnable = builder.build(definition, solution)

Comment on lines 42 to 58
@model_validator(mode="after")
def _validate_python_syntax(self) -> "SourceFile":
"""Validate Python syntax for .py files.
def _validate_source_path(self) -> "SourceFile":
"""Validate source path for security.
Raises
------
ValueError
If the file is a Python file and contains invalid syntax.
If the path contains security issues (absolute paths or path traversal).
"""
if self.path.endswith(".py"):
try:
ast.parse(self.content, mode="exec")
except SyntaxError as e:
raise ValueError(f"SourceFile content must be valid Python code: {e}") from e

# TODO(shanli): syntax validation for other languages
src_path = Path(self.path)
if src_path.is_absolute():
raise ValueError(f"Invalid source path (absolute path not allowed): {self.path}")
if ".." in src_path.parts:
raise ValueError(
f"Invalid source path (parent directory traversal not allowed): {self.path}"
)
return self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This change removes the validation of Python syntax for .py files, which was previously handled by _validate_python_syntax. While the new path validation is an important security enhancement, removing the syntax check is a slight regression in data integrity, as it no longer prevents syntactically incorrect Python code from being included in a Solution. Was this removal intentional? If so, it would be good to understand the reasoning. Perhaps both validations could be combined.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation of Python file should be handled by the builder/compiler, instead of by the schema validation side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant