The spec says gemm returns "an Operand" (and likewise for matmul).
If both arguments are tensor-quant8-asymm, what is the OperandType of the return? I can see use cases for tensor-int32 (which is how existing hardware actually produces the accumulator), tensor-quant8-asymm for a fully quantized model, or even tensor-float32 for people who have only partly quantized their model.
This matters because the spec doesn't appear to have, e.g., a requantization operator to convert int32 to int8, and in any case one would need the ability to set the output scaling factor, obtained in advance by running the model to measure an appropriate value.
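To make the question concrete, here is a minimal sketch (not anything from the spec; the function name and parameters are made up for illustration) of the standard asymmetric-quantization scheme, where a quantized value q represents the real number scale * (q - zero_point). It shows why the int32 accumulator appears and why the output scale must be calibrated ahead of time:

```python
import numpy as np

def quantized_matmul(a_q, b_q, a_scale, a_zp, b_scale, b_zp, out_scale, out_zp):
    """Hypothetical quant8-asymm matmul: accumulate in int32,
    then requantize the result back to quant8-asymm."""
    # Accumulate in int32 to avoid overflow -- this is the
    # "tensor-int32" result that hardware actually generates.
    acc = (a_q.astype(np.int32) - a_zp) @ (b_q.astype(np.int32) - b_zp)
    # Requantize to the output's quantization parameters. The
    # combined multiplier depends on out_scale, which has to be
    # measured offline -- hence the need to set it in advance.
    multiplier = (a_scale * b_scale) / out_scale
    out = np.round(acc * multiplier) + out_zp
    return np.clip(out, 0, 255).astype(np.uint8)
```

For example, with a_scale = b_scale = out_scale = 0.1 and zero points of 0, multiplying quantized values 10 (real 1.0) and 20 (real 2.0) accumulates to int32 200 and requantizes to 20 (real 2.0).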
[This issue was originally posted at https://github.com/w3c/machine-learning-workshop/issues/86 ]
@kpu wrote: