Add MLP & QLoRA Fused Ops and Kernels, Mixtral #29

Merged: 12 commits, Jun 2, 2024

@fabianlim commented May 30, 2024

Completing more items in #25.

  • Decided to remove the L40 benchmarks.

Verified that we can reproduce the roughly 20% speedup from the fused ops and kernels.

  • These are per-device throughputs, so for two GPUs we should multiply by 2 to get the actual throughput.
    image
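
To make the per-device note concrete, here is a minimal sketch of the arithmetic. The function names and the token rates are hypothetical placeholders, not figures from the benchmark image, and it assumes every GPU reports the same per-device throughput.

```python
def aggregate_throughput(per_device_tps: float, num_gpus: int) -> float:
    """Total tokens/sec, assuming every device runs at the same per-device rate."""
    return per_device_tps * num_gpus


def relative_speedup(baseline_tps: float, optimized_tps: float) -> float:
    """Fractional speedup of the optimized run over the baseline run."""
    return optimized_tps / baseline_tps - 1.0


# Hypothetical per-device numbers, chosen only to illustrate a ~20% speedup.
baseline_per_device = 1000.0
fused_per_device = 1200.0
num_gpus = 2

baseline_total = aggregate_throughput(baseline_per_device, num_gpus)
fused_total = aggregate_throughput(fused_per_device, num_gpus)

# Scaling both runs by the same GPU count leaves the speedup ratio unchanged.
per_device_gain = relative_speedup(baseline_per_device, fused_per_device)
total_gain = relative_speedup(baseline_total, fused_total)
```

The speedup percentage is the same whether it is computed from per-device or aggregate numbers, as long as both runs use the same number of GPUs; only the absolute throughput needs the multiplication.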

Verified that we can reproduce the 75% memory reduction using 4-bit base weights.

  • Also, with FSDP on two GPUs, we see a further 50% memory reduction.
    image
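
As a rough sanity check on these two figures, the idealized arithmetic can be sketched as below. This is an assumption-laden illustration, not the benchmark's measurement: it assumes the baseline stores weights in 16 bits, uses a hypothetical parameter count, and ignores quantization metadata, adapters, activations, and optimizer state.

```python
def weight_bytes(num_params: int, bits_per_param: int) -> float:
    """Storage for the base weights only, in bytes."""
    return num_params * bits_per_param / 8


def reduction(before: float, after: float) -> float:
    """Fractional memory reduction going from `before` to `after`."""
    return 1.0 - after / before


num_params = 7_000_000_000  # hypothetical model size, for illustration only

# Assumption: the unquantized baseline holds weights in 16 bits.
bytes_16bit = weight_bytes(num_params, 16)
bytes_4bit = weight_bytes(num_params, 4)
quant_reduction = reduction(bytes_16bit, bytes_4bit)  # 4/16 of the original size

# Assumption: FSDP shards the weights evenly, so each of 2 GPUs holds half.
num_gpus = 2
bytes_4bit_per_gpu = bytes_4bit / num_gpus
fsdp_reduction = reduction(bytes_4bit, bytes_4bit_per_gpu)
```

Under these assumptions the weights shrink to a quarter of their size (a 75% reduction), and even sharding over two GPUs halves the per-device share again. Measured numbers will differ from this idealized view because real runs also hold activations, adapters, and quantization constants.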

Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
@fabianlim requested a review from achew010 May 30, 2024 08:45
@fabianlim self-assigned this May 30, 2024
@fabianlim changed the title from "Add MLP Fused Ops and Kernels, Mixtral" to "Add MLP Fused Ops and Kernels, Mixtral, QLoRA Kernels" May 30, 2024
@fabianlim changed the title from "Add MLP Fused Ops and Kernels, Mixtral, QLoRA Kernels" to "Add MLP & QLoRA Fused Ops and Kernels, Mixtral" May 30, 2024
@fabianlim force-pushed the fix-foak-final branch 2 times, most recently from 2617d8c to fa50cf2 May 30, 2024 11:50
Signed-off-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
@fabianlim
Contributor Author

Running a set of benchmarks now; will merge once they complete.

@fabianlim merged commit 8103238 into foundation-model-stack:dev Jun 2, 2024
4 checks passed
@fabianlim
Contributor Author

@achew010 pls update if you have obtained the new benches.
