[WIP] MUSA: enable fastfp16, correct warp reduce impl and perf tuning #12383

Draft
wants to merge 4 commits into
base: master
from

Conversation

BodhiHu
Contributor

@BodhiHu BodhiHu commented Mar 14, 2025

Make sure to read the contributing guidelines before submitting a PR

[WIP] This PR will:

  1. enable fastfp16;
  2. correct warp reduce impl;
  3. perf tuning ...

@github-actions github-actions bot added the "Nvidia GPU" (Issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) labels Mar 14, 2025
@BodhiHu BodhiHu marked this pull request as draft March 14, 2025 09:38
@BodhiHu BodhiHu changed the title from "MUSA: enable fastfp16 and correct warp reduce impl" to "[WIP] MUSA: enable fastfp16 and correct warp reduce impl" Mar 14, 2025
@BodhiHu BodhiHu changed the title from "[WIP] MUSA: enable fastfp16 and correct warp reduce impl" to "[WIP] MUSA: enable fastfp16 and correct warp reduce impl and perf tuning" Mar 14, 2025
@BodhiHu BodhiHu changed the title from "[WIP] MUSA: enable fastfp16 and correct warp reduce impl and perf tuning" to "[WIP] MUSA: enable fastfp16, correct warp reduce impl and perf tuning" Mar 14, 2025
@JohnLoveJoy

The auto-label system is bugged, accidentally tagging everything with "Nvidia GPU."

@zhouwg
Contributor

zhouwg commented Mar 15, 2025

How do I request a review from a specific expert on a specific PR? Thanks so much!

@JohannesGaessler
Collaborator

I can review the code in terms of its effects on the CUDA backend, but I don't know anything about MUSA and do not have any MT hardware, so I would need someone else to review whether the changes are correct for MUSA.

@IMbackK
Collaborator

IMbackK commented Mar 17, 2025

Same here, but for effects on the HIP backend.

One thing I can say code-style-wise is that this PR currently contains a lot of stuff that does not belong, like commented-out logging printfs and the like.

Otherwise it looks reasonable, but I have no way of verifying correctness for MUSA.

@BodhiHu
Contributor Author

BodhiHu commented Mar 17, 2025

Hi, thanks for the comments here.
It's still a WIP draft PR and not ready for review yet.

Sorry that it contains lots of debug prints for now; they will be removed once the perf tuning is done ;D

5 participants