OperandType of gemm / matmul return #86

Closed
kpu opened this issue Aug 27, 2020 · 4 comments

Comments

kpu commented Aug 27, 2020

The spec says gemm returns "an Operand" (and the same thing for matmul).

If both arguments are tensor-quant8-asymm, what is the OperandType of the return? I can see use cases for tensor-int32, which is how existing hardware will actually generate it; tensor-quant8-asymm, for a fully quantized model; or even tensor-float32, for people who have only partly quantized their model.

This matters because the spec doesn't appear to have, e.g., a requantization operator to convert int32 to int8, and in any case one would need the ability to set the output scaling factor, which is typically measured in advance by running the model.
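
To make this concrete, here is a minimal sketch in plain TypeScript of the arithmetic involved; this is not the WebNN API, and every name in it (QuantParams, matmulQuant8ToInt32, requantize) is hypothetical. An int8 × int8 matmul naturally accumulates into int32 with an effective scale of a.scale * b.scale, and producing a quant8-asymm result requires a requantization step whose output scale has to be chosen ahead of time:

```ts
// Hypothetical illustration of quantized matmul arithmetic; not the WebNN API.
// A quantized value q represents the real value scale * (q - zeroPoint).
// 8-bit storage is shown here as unsigned with a zero point; signed works the same way.
interface QuantParams {
  scale: number;
  zeroPoint: number;
}

// 8-bit x 8-bit matmul: the natural accumulator type is int32 (the "tensor-int32"
// return option). The effective scale of the accumulator is aq.scale * bq.scale.
function matmulQuant8ToInt32(
  a: Uint8Array, aq: QuantParams,
  b: Uint8Array, bq: QuantParams,
  m: number, k: number, n: number
): Int32Array {
  const out = new Int32Array(m * n);
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      let acc = 0;
      for (let p = 0; p < k; p++) {
        acc += (a[i * k + p] - aq.zeroPoint) * (b[p * n + j] - bq.zeroPoint);
      }
      out[i * n + j] = acc;
    }
  }
  return out;
}

// Requantization: int32 -> quant8-asymm (the "tensor-quant8-asymm" return option).
// outQ must be chosen in advance, e.g. by running the model on calibration data
// to measure an appropriate output scale.
function requantize(acc: Int32Array, accScale: number, outQ: QuantParams): Uint8Array {
  const out = new Uint8Array(acc.length);
  for (let i = 0; i < acc.length; i++) {
    const q = Math.round(acc[i] * (accScale / outQ.scale)) + outQ.zeroPoint;
    out[i] = Math.min(255, Math.max(0, q));
  }
  return out;
}
```

The outQ parameter above is exactly the scaling factor one would need to be able to set in advance.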

anssiko commented Aug 27, 2020

Thanks for your comment. To ensure this detailed spec feedback is addressed appropriately, I've transferred the issue to the WebNN API specification repo where the API design work happens:
webmachinelearning/webnn#84

anssiko closed this as completed Aug 27, 2020

wchao1115 commented

@kpu this issue has previously been discussed in webmachinelearning/webnn#44. I will be refactoring quantization-related procedural data from the OperandDescriptor type as we incorporate aspects of quantization work into the operator API.

kpu commented Sep 8, 2020

@wchao1115 The issue you referenced, webmachinelearning/webnn#44, is about how the quantization scaling factor and zero point should be included in OperandDescriptor.

As the title of this issue says, this is about the OperandType of the return value from matmul. Should multiplying int8 by int8 return float32, return int32, or take a scaling factor and return int8?

This has nothing to do with how the scaling factor is encoded in OperandDescriptor (and your suggestion that it not be).
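
For concreteness, here is a small hypothetical sketch (plain TypeScript, not the WebNN API) of what each candidate return type means for a single int32 accumulator value, given assumed input scales sA and sB and a pre-chosen output scale/zero point sOut and zOut:

```ts
// Three ways a quant8 x quant8 matmul result could be surfaced, shown for a
// single int32 accumulator value. All names here are hypothetical.
function interpretAccumulator(
  acc: number,                 // int32 accumulator from the matmul inner product
  sA: number, sB: number,      // input quantization scales
  sOut: number, zOut: number   // output scale / zero point, chosen in advance
) {
  const asInt32 = acc;                         // tensor-int32: effective scale is sA * sB
  const asFloat32 = acc * sA * sB;             // tensor-float32: dequantized result
  const asQuant8 = Math.min(255, Math.max(0,   // tensor-quant8-asymm: requantized result
    Math.round(acc * (sA * sB) / sOut) + zOut));
  return { asInt32, asFloat32, asQuant8 };
}
```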

wchao1115 commented

@kpu you are right that they are not the same issue. I only meant to point out that the issue around how to properly support quantization is not fully resolved, and that #44 is related to that whole conversation. I didn't mean to suggest that they are the same issue.
