
Question about "Bit-serial linear transformation" in paper #35

Answered by kaleid-liner
zjnyly asked this question in Q&A

Thanks for your interest in the project.

However, does this lead to significant precision loss? If so, is this loss introduced during the quantization of the LUT?

Yes, this loss is introduced during the quantization of the LUT, but it is negligible. The baseline llama.cpp also applies int8 activation quantization, and according to Section 5.6 of our paper, LUT quantization achieves exactly the same results as activation quantization.

Such fine-grained activation quantization does not lead to significant precision loss, and the loss it does introduce is negligible compared to the loss introduced by weight quantization.
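
For intuition, here is a minimal NumPy sketch, not the T-MAC implementation, comparing the two quantization points discussed above: quantizing the precomputed LUT directly versus quantizing the activations first and then building the LUT (the llama.cpp-style baseline). The group size, the 1-bit weight patterns, and the helper `quant_int8` are illustrative assumptions; in this toy setup the two approaches introduce errors of the same order of magnitude.

```python
import numpy as np

rng = np.random.default_rng(0)

g = 4                      # activations per LUT group (illustrative assumption)
n_groups = 256
act = rng.standard_normal((n_groups, g)).astype(np.float32)

# Enumerate all 2^g one-bit weight patterns for a group of g activations.
patterns = np.array([[(i >> b) & 1 for b in range(g)] for i in range(1 << g)],
                    dtype=np.float32)          # shape (16, 4)

# Float LUT: every possible partial sum of a group under a 1-bit weight pattern.
lut_fp = act @ patterns.T                      # shape (n_groups, 16)

def quant_int8(x, axis=-1):
    """Symmetric per-group int8 quantization (illustrative, not the T-MAC kernel)."""
    scale = np.max(np.abs(x), axis=axis, keepdims=True) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

# (a) Quantize the precomputed LUT directly.
lut_q, lut_s = quant_int8(lut_fp)
lut_deq = lut_q.astype(np.float32) * lut_s

# (b) Quantize the activations first, then build the LUT from them.
act_q, act_s = quant_int8(act)
lut_from_act = (act_q.astype(np.float32) * act_s) @ patterns.T

print("mean abs error, LUT quantization       :", np.abs(lut_deq - lut_fp).mean())
print("mean abs error, activation quantization:", np.abs(lut_from_act - lut_fp).mean())
```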

Would it be possible to avoid this precision loss by not quantizing the LUT?

S…

Answer selected by zjnyly
Category: Q&A
Labels: question (Further information is requested)
2 participants
This discussion was converted from issue #34 on August 30, 2024 08:02.