[DO NOT LAND] Prototype Helion kernel in vLLM #29051

gmagogsfm · 2025-11-20T00:35:06Z

- This prototype implements a naive silu_mul_fp8 kernel in Helion and integrates it into vLLM's custom fusion pass as a custom op.
- Numerical accuracy is verified.
- On average, it is about 4x slower than vLLM's custom silu_mul_fp8 CUDA kernel.
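
For readers unfamiliar with the op, here is a minimal pure-Python reference sketch of what a silu_mul_fp8 kernel is generally expected to compute. It is not the Helion kernel from this PR. It assumes the usual "SiLU and multiply" layout (the first half of each row is the gate, the second half is the multiplier) followed by static per-tensor quantization to the FP8 e4m3 grid. The names `silu_mul_fp8_reference` and `round_to_e4m3`, and the `scale` argument, are hypothetical and chosen only for illustration.

```python
import math

# Largest finite magnitude of the FP8 e4m3fn format.
FP8_E4M3_MAX = 448.0


def silu(x: float) -> float:
    """SiLU(x) = x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0.0:
        return x / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return x * ex / (1.0 + ex)


def round_to_e4m3(x: float) -> float:
    """Clamp x to the e4m3 range and round it to the nearest e4m3 value.

    Assumption: 3 mantissa bits, minimum normal exponent -6, ties to even.
    NaN handling is intentionally left out of this sketch.
    """
    x = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))
    if x == 0.0:
        return 0.0
    _, e = math.frexp(abs(x))      # abs(x) = m * 2**e with 0.5 <= m < 1
    exponent = max(e - 1, -6)      # below 2**-6 the grid is subnormal
    step = 2.0 ** (exponent - 3)   # spacing of representable values
    # Python's round() rounds halves to the even integer.
    return math.copysign(round(abs(x) / step) * step, x)


def silu_mul_fp8_reference(row: list[float], scale: float) -> list[float]:
    """Compute round_to_e4m3(silu(gate) * up / scale) for one row.

    `row` holds the gate values followed by the multiplier values.
    """
    d = len(row) // 2
    gate, up = row[:d], row[d:]
    return [round_to_e4m3(silu(g) * u / scale) for g, u in zip(gate, up)]


print(silu_mul_fp8_reference([0.0, 1.0, 2.0, 3.0], scale=1.0))
```

A reference like this is one way to check a candidate kernel's numerical accuracy: run both on the same inputs and compare the dequantized outputs element by element.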

Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
