I modified the sparse attribute in FasterTransformer/examples/cpp/multi_gpu_gpt, expecting it to speed up inference. Instead, it is slower than the dense alternative regardless of batch size. It also consumes much more memory than the dense option, which is counterintuitive. Is there an explanation for this behavior? Thanks.