Introduce Qconv2d #91

dacorvo · 2024-02-20T11:04:43Z

This layer behaves similarly to QLinear. There are no accelerated quantized operations for convolutions, so its main use is to reduce on-device memory.

For now, the aten.convolution operation is not dispatched for QTensor, so it always falls back to a float operation on dequantized inputs and outputs.
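As a rough illustration of that fallback, here is a dependency-free sketch in plain Python. It is not the code in this PR: `SketchQConv2d`, `quantize_int8`, `dequantize` and `conv2d_float` are hypothetical names, and symmetric per-tensor int8 quantization is an assumption made for the example. It only shows the idea that weights are stored in a compact integer form and dequantized back to float before the convolution runs.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization (assumed scheme): floats -> (int8 array, scale)."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale for all-zero weights
    scale = max_abs / 127.0
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Map the stored int8 values back to floats."""
    return [v * scale for v in q]


def conv2d_float(image, weights, kh, kw):
    """Single-channel 2D cross-correlation on floats (stride 1, no padding)."""
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * weights[di * kw + dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


class SketchQConv2d:
    """Hypothetical layer: int8 weights at rest, float convolution at call time."""

    def __init__(self, kernel_rows):
        self.kh = len(kernel_rows)
        self.kw = len(kernel_rows[0])
        flat = [w for row in kernel_rows for w in row]
        # Only the int8 values and one float scale are kept in memory.
        self.qweight, self.scale = quantize_int8(flat)

    def forward(self, image):
        # No quantized convolution kernel: dequantize, then compute in float.
        weights = dequantize(self.qweight, self.scale)
        return conv2d_float(image, weights, self.kh, self.kw)
```

In this sketch the saving comes purely from storage (one byte per weight plus a scale); the arithmetic itself is still done in float, which matches the trade-off described above.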

dacorvo added 4 commits February 20, 2024 12:01

refactor(QModuleMixin): share code and allow float8 weights

4518da6

test(qlinear): simplify gradient test

d5e0a22

feat(nn): added QConv2d

fd3df92

doc: update README

9b0b176

dacorvo merged commit cdff8a6 into main Feb 20, 2024
3 checks passed

dacorvo deleted the qconv2d branch February 20, 2024 11:10

dacorvo mentioned this pull request Feb 20, 2024

Add support for quantized Conv2d #74

Closed
