cuSPARSELt is slower? #767

BDHU · 2023-10-16T11:11:55Z

I modified the sparse attribute in FasterTransformer/examples/cpp/multi_gpu_gpt, expecting it to speed up inference. Instead, it turns out to be slower than the dense alternative regardless of batch size. It also consumes much more memory than the dense option, which is counterintuitive. Is there any explanation for this behavior? Thanks.


YixinSong-e · 2023-10-17T09:25:50Z

Hello, have you tried flash-llm?
