Extend quantiser support so as to accelerate more binary models. #668
Conversation
Looks great!
LGTM.
Just a note to ourselves: when merging this into our private fork it will probably not cause any automatic merge conflicts, but we'll need to update the micro quantization kernels (should be fairly trivial, just by copying the changes from the public quantization kernel).
Nicely done!
It took me a while to wrap my head around boolean input for `lq.quantize`, but I think it makes a lot of sense and is much easier to maintain than passing an additional threshold attribute to the op or doing something similar.
I just have some additional comments regarding broadcasting and the matching of `tf.where`.
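For concreteness, a rough sketch of the two designs (the threshold attribute in the second variant is hypothetical, named here only to illustrate the alternative):

```python
import tensorflow as tf

x = tf.random.uniform((1, 8, 8, 3))

# Chosen design: the quantiser takes a boolean input, so the condition
# is expressed with ordinary TF ops and the op itself stays generic.
condition = x > 0.1  # any boolean expression works
binary = tf.where(condition, tf.ones_like(x), -tf.ones_like(x))

# Alternative (hypothetical): bake the threshold into the op itself,
# e.g. quantize(x, threshold=0.1), which would require a new op
# attribute for every quantiser variant.
```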
This is great, thank you so much for figuring this out!
I just have a few questions to understand the code; other than that, this looks great to me 🚀
Add the ability to convert `tf.where`-style binary quantisers, and add support for boolean input to `LceQuantize` and `LceDequantize`.
What do these changes do?
This PR extends the LCE converter with patterns that support converting `tf.where`-style binary quantisers. It also adds support for binary input/output of `LceQuantize`/`LceDequantize`.

Note that in larq/larq#677 we are moving towards having the Larq quantisers implemented with `tf.where` instead of the current `tf.sign` implementation. The main change is that this PR adds support for converting `tf.where`-style quantisers. The second change adds support for boolean input, meaning that a wider variety of binary quantisers will be accelerated by LCE. For example, the following wacky quantiser will now convert successfully:
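A minimal sketch of such a quantiser (the particular condition here is arbitrary, chosen only to show that the boolean condition can be built from ordinary TF ops; gradient handling, e.g. a straight-through estimator, is omitted since only the forward graph matters for conversion):

```python
import tensorflow as tf

def wacky_quantiser(x):
    # The boolean condition may be built from arbitrary TF ops; the
    # converter only needs the final tf.where(cond, 1, -1) pattern.
    condition = tf.abs(tf.sin(x)) > tf.reduce_mean(tf.abs(x))
    return tf.where(condition, tf.ones_like(x), -tf.ones_like(x))
```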
With these changes, any binary quantiser that can be implemented as `tf.where(boolean_condition, 1, -1)` can be converted into an `LceQuantize` op (and consequently, a subsequent convolution can be converted to a `BConv2D` and thus accelerated). The `boolean_condition` will be implemented with TFL ops, as in the example above, but since the quantisation is in general so quick compared to the binary convolution, this doesn't present much of a performance issue.

How Has This Been Tested?
MLIR test cases have been added; the end2end tests have been extended.
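For reference, an end2end-style check has roughly this shape (a sketch, not the actual test code; it assumes the `convert_keras_model` entry point from `larq_compute_engine` and that `input_quantizer` accepts an arbitrary callable such as the `wacky_quantiser` above):

```python
import larq as lq
import tensorflow as tf
from larq_compute_engine import convert_keras_model

# Build a tiny model whose input quantiser is the tf.where-style
# quantiser sketched above; the converter should turn the pattern
# into an LceQuantize feeding a BConv2D.
inp = tf.keras.Input((32, 32, 3))
out = lq.layers.QuantConv2D(
    16, 3,
    input_quantizer=wacky_quantiser,
    kernel_quantizer="ste_sign",
    use_bias=False,
)(inp)
model = tf.keras.Model(inp, out)

tflite_model = convert_keras_model(model)  # TFLite flatbuffer bytes
```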
Benchmark Results
N/A.
Related issue number
larq/larq#677