Support FP8 grouped GEMM with cudagraph #3373

Open
wants to merge 2 commits into base: main

Commits on Nov 14, 2024

  1. Refactor FP8 grouped GEMM to prepare cudagraph support (pytorch#3369)

    Summary:
    
    X-link: facebookresearch/FBGEMM#460
    
    Refactor FP8 grouped GEMM to extract the grouped GEMM arguments and configurations ahead of the grouped GEMM kernel launch, so that they can be reused by another CUDA kernel that performs argument setup on device (see the first sketch after this commit list).
    
    Differential Revision: D65548954
    jiawenliu64 authored and facebook-github-bot committed Nov 14, 2024
    Full SHA: 9b4b04b
  2. Support FP8 grouped GEMM with cudagraph (pytorch#3373)

    Summary:
    
    X-link: facebookresearch/FBGEMM#463
    
    Enable cudagraph support for FP8 grouped GEMM
    
    Compared to the CUDA graph support for the CK grouped GEMM in D65634843, this is considerably more challenging: graph capture here has to handle more complicated kernel arguments, including the various per-group pointer arrays and their memory alignment (see the second sketch after this commit list).
    
    Differential Revision: D65864972
    jiawenliu64 authored and facebook-github-bot committed Nov 14, 2024
    Full SHA: 1c3720a
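
The refactor in the first commit matters because a CUDA graph freezes host-side work at capture time: any pointer arrays built on the host would be baked into the graph and never refreshed. A minimal sketch of the idea follows. All names here (GroupedGemmProblem, setup_grouped_gemm_args) are hypothetical illustrations, not FBGEMM's actual API: a small device kernel rebuilds the per-group pointer arrays from precomputed offsets, so the setup itself is replayable inside the graph.

#include <cuda_runtime.h>
#include <cstdint>

// Per-group problem shape and byte offsets into the packed buffers,
// precomputed once and stored in device memory. FP8 elements are one
// byte each, so byte offsets double as element offsets for A and B.
struct GroupedGemmProblem {
  int m, n, k;
  int64_t a_offset;
  int64_t b_offset;
  int64_t c_offset;
};

// Builds the per-group pointer arrays that the grouped GEMM kernel
// consumes. Because this runs on the device, it can be captured into a
// CUDA graph and re-executed on every replay, instead of being frozen
// at capture time like host-side setup would be.
__global__ void setup_grouped_gemm_args(
    const GroupedGemmProblem* problems,
    const uint8_t* a_base,
    const uint8_t* b_base,
    uint8_t* c_base,
    const void** a_ptrs,
    const void** b_ptrs,
    void** c_ptrs,
    int num_groups) {
  int g = blockIdx.x * blockDim.x + threadIdx.x;
  if (g < num_groups) {
    a_ptrs[g] = a_base + problems[g].a_offset;
    b_ptrs[g] = b_base + problems[g].b_offset;
    c_ptrs[g] = c_base + problems[g].c_offset;
  }
}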
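Continuing the sketch above, the second commit's capture/replay pattern might look roughly like the following. launch_fp8_grouped_gemm is a hypothetical stand-in for the real grouped GEMM dispatch, and all buffers are assumed to be stable device allocations that outlive the graph.

// Hypothetical stand-in for the actual grouped GEMM dispatch.
void launch_fp8_grouped_gemm(
    const void** a_ptrs, const void** b_ptrs, void** c_ptrs,
    const GroupedGemmProblem* problems, int num_groups, cudaStream_t stream);

void capture_and_replay(
    const GroupedGemmProblem* problems,
    const uint8_t* a_base, const uint8_t* b_base, uint8_t* c_base,
    const void** a_ptrs, const void** b_ptrs, void** c_ptrs,
    int num_groups) {
  cudaStream_t stream;
  cudaStreamCreate(&stream);

  constexpr int kThreads = 128;
  const int blocks = (num_groups + kThreads - 1) / kThreads;

  // Capture both the argument-setup kernel and the grouped GEMM on the
  // same stream, so the pointer arrays are rebuilt during every replay.
  cudaGraph_t graph;
  cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
  setup_grouped_gemm_args<<<blocks, kThreads, 0, stream>>>(
      problems, a_base, b_base, c_base, a_ptrs, b_ptrs, c_ptrs, num_groups);
  launch_fp8_grouped_gemm(a_ptrs, b_ptrs, c_ptrs, problems, num_groups, stream);
  cudaStreamEndCapture(stream, &graph);

  cudaGraphExec_t graph_exec;
  cudaGraphInstantiate(&graph_exec, graph, nullptr, nullptr, 0);

  // Replay: the data behind a_base/b_base may change between launches,
  // but every address baked into the graph stays valid because the
  // pointer arrays are recomputed on-device inside the graph itself.
  cudaGraphLaunch(graph_exec, stream);
  cudaStreamSynchronize(stream);

  cudaGraphExecDestroy(graph_exec);
  cudaGraphDestroy(graph);
  cudaStreamDestroy(stream);
}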