# argMax/Min only support scalar axis in TFLite runtime #629
Looking at other backends and frameworks: TensorFlow, PyTorch, CoreML, and ONNX all support only a scalar axis. Only DirectML supports providing multiple axes. From its documentation, it seems that if all axes are provided, it reduces to a single value, and the indices are calculated as if the input were a flat 1-D array. I don't know what happens if you give it multiple but not all axes, e.g. for a 4-D array with axes=[1,3]: how does it determine the indices across dimensions if they were to be reduced together? In general it seems strange to get a min/max index across dimensions (if that's the expected behavior).

Should we consider supporting only a scalar axis? If we keep support for an axes array, we should clarify how the reduction is done when multiple dimensions are specified.
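For reference, NumPy's axis-less `argmax` shows the flat-index semantics described above when all dimensions are reduced together; a minimal illustration:

```python
import numpy as np

x = np.array([[1, 5, 2],
              [3, 4, 6]])
# With no axis given, everything is reduced and the result is an index
# into the flattened (row-major) array, as described for DML's all-axes case.
print(np.argmax(x))  # 5 — the flat position of the maximum value, 6
```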
Since ONNX also only supports a scalar axis, I wouldn't be surprised if DML's multiple-axis feature is not actually used at all by WebNN currently. This is similar to the problem @fdwr explored in #645 (comment): DML supports … In general, I don't think we should be plumbing DML-specific quirks through to WebNN. I'm in favor of aligning with the other frameworks and supporting only a scalar axis.
The issues you linked to above are about …
@fdwr @huningxin, gentle ping on this. I still propose supporting only a scalar axis. Further, the current fallback behavior when axes is not provided is: …

If we change to a scalar axis, WDYT?
SGTM!
PyTorch reduces all axes if `dim` is not specified.

But I guess this behavior can be emulated by reshaping the tensor to 1-D and reducing along axis 0. So, +1 to defaulting it to …
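A minimal PyTorch sketch of that emulation:

```python
import torch

x = torch.tensor([[1, 5, 2],
                  [3, 4, 6]])
# Emulate "reduce over all axes" with a single-axis reduction:
# flatten to 1-D, then take argmax along axis 0.
print(torch.argmax(x.reshape(-1), dim=0))  # tensor(5)
```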
Sorry for the delay, Phillis. Regardless of our final decisions here, I want to first explain how multidimensional min/max index reduction works, and also rationalize (when you look more holistically across the related functions) why ND argMin/argMax is a …

### ND argMin/argMax examples to illustrate indexing
Here are some reduction indexing examples.

**Adjacent axes, 2D within 4D:**

```
input.shape = [2,3,4,5]
# omitting input values since they would be numerous
reduction axes = [1,2]
output.shape = [2,1,1,5] # keeping dimensions
element indices of the [_,3,4,_] subset:
[
[0,1,2,3],
[4,5,6,7],
[8,9,10,11],
]
```

**Non-adjacent axes, 2D within 4D:**

```
input.shape = [2,3,4,5]
reduction axes = [1,3]
output.shape = [2,1,4,1]
element indices of the [_,3,_,5] subset:
[
[0,1,2,3,4],
[5,6,7,8,9],
[10,11,12,13,14],
]
```

**Single axis in 4D:**

```
input.shape = [2,3,4,5]
reduction axes = [3]
output.shape = [2,3,4,1]
element indices of the [_,_,_,5] subset:
[0,1,2,3,4]
```

**All axes reduced, 4D:**

```
input.shape = [2,3,4,5]
reduction axes = [0,1,2,3]
output.shape = [1,1,1,1]
element indices of the [2,3,4,5] subset:
[[[[0,1,2,3,4],[5,6,7,8,9]... [...119]]]]
```

**No axes reduced (nop):**

```
input.shape = [2,3,4,5]
reduction axes = [] # explicitly present but empty
output.shape = [2,3,4,5]
element indices of the [_,_,_,_] subset:
0 # a single scalar; effectively identity, the same behavior as reduceMin with axes=[]
```

So it's as if you took a slice from the input using the reduced axes, then ordered the element indices linearly. Notice that if you transpose those previously non-adjacent axes from above toward the back, you still get the same element indices within that reduced slice, e.g.:

```
input.shape = [2,4,3,5] # Transposed from [2,3,4,5] above
reduction axes = [2,3]
output.shape = [2,4,1,1]
element indices of the [_,_,3,5] subset:
[
[0,1,2,3,4],
[5,6,7,8,9],
[10,11,12,13,14],
]
```

And if you flatten those last two axes into a single axis, you still get the same indices:

```
input.shape = [2,4,15]
reduction axes = [2]
output.shape = [2,4,1]
element indices of the [_,_,15] subset:
[0,1,2,3,4, 5,6,7,8,9, 10,11,12,13,14]
```

### Functional equivalence of capability

The general ND argMin/argMax is the superset form, and the single-axis form is a simpler subset of it; but conversely, a single-axis parameterization has equivalent capability to the ND form when combined with a transpose/reshape, and so the ND form just skips extra steps (like squashing the width and height or other spatial dimensions, which the caller would otherwise have to do explicitly) by operating on the tensor directly. So backends like TF and CoreML are fully capable of implementing WebNN's current form, e.g. pseudocode:

```
func ImplementNdArgMinViaSingleAxisBackend(input, axes)
{
if (axes.size == 1) { // reduce single axis - just ferry axis along
reducedInput = backend.argmin(input, axis=axes[0])
} else if (axes not defined) { // reduce all
reshapedInput = backend.reshape(input, [input.size]);
argMinResult = backend.argmin(reshapedInput, axis=0);
reducedInput = backend.reshape(argMinResult, shapeOfOnes(input.rank)) // restore rank, all dimensions kept as size 1
} else { // axes.size > 1
permutation = MakePermutationThatPutsAxesAtBack(input.rank, axes);
reversePermutation = MakePermutationThatRestoresAxes(input.rank, axes);
flattenedShape = MakeFlattenedShape(input.shape, axes);
unflattenedShape = MakeUnflattenedShape(input.shape, axes);
reshapedInput = backend.reshape(backend.transpose(input, permutation), flattenedShape);
argMinResult = backend.argmin(reshapedInput, axis = reshapedInput.rank - 1); // the flattened axes are now the single back axis
reducedInput = backend.transpose(backend.reshape(argMinResult, unflattenedShape), reversePermutation);
}
return reducedInput;
}
```

### Front end / backend impact

Front-end callers have no challenge with the current ND design, because if coming from a single-axis world, it's simply passing `axes = [axis]`.
Single-axis backend implementations have more of a challenge, but it's certainly possible (see `ImplementNdArgMinViaSingleAxisBackend` above).
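For concreteness, a runnable NumPy sketch of that pseudocode (the helper name `argmin_nd` is illustrative, not part of any API); it reproduces the non-adjacent-axes example above:

```python
import numpy as np

def argmin_nd(tensor, axes):
    """Emulate ND argMin on a single-axis backend: transpose the reduced
    axes to the back, flatten them into one axis, reduce that axis, then
    restore the kept dimensions (reduced dimensions stay as size 1)."""
    rank = tensor.ndim
    axes = sorted(a % rank for a in axes)
    kept = [a for a in range(rank) if a not in axes]
    transposed = np.transpose(tensor, kept + axes)
    reduced_size = int(np.prod([tensor.shape[a] for a in axes]))
    flattened = transposed.reshape([tensor.shape[a] for a in kept] + [reduced_size])
    result = np.argmin(flattened, axis=-1)  # the single-axis backend call
    return result.reshape([1 if a in axes else tensor.shape[a] for a in range(rank)])

x = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
print(argmin_nd(x, [1, 3]).shape)  # (2, 1, 4, 1); indices are within each [_,3,_,5] slice
```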
### Default axes when absent

> PyTorch reduces all axes if `dim` is not specified.

Yep, NumPy too. When undefined, they act as if all axes were listed (e.g. for 4D, axes = [0,1,2,3]).
🤔 That would be inconsistent with the other reduction functions like reduceMin/reduceMax, and intra-API consistency has value too; but outside WebNN, there is no clear-cut winner for omitted axes. Defaults include:

- all axes reduced (NumPy, PyTorch)
- axis 0 (TensorFlow, ONNX)
- the last axis (CoreML)
In cases of such ambiguity, rather than favoring one, maybe we should require an explicit value. WebNN should generally be explicit anyway.

### References

#### NumPy

```python
import numpy
x = numpy.array([[1,5,2],[3,4,6]], dtype=numpy.int64)
y = numpy.argmax(x, keepdims=True)
print("value:\n", y, sep="")
print("shape:", y.shape)
print("dtype:", y.dtype)
# value: [[5]]
# shape: (1, 1)
# dtype: int64
```

#### PyTorch

```python
import torch
x = torch.tensor([[1,5,2],[3,4,6]], dtype=torch.float32)
y = torch.argmax(x, keepdim=True)
print("value:", y)
print("shape:", y.shape)
print("dtype:", y.dtype)
# value: tensor([[5]])
# shape: torch.Size([1, 1])
# dtype: torch.int64
```

#### TensorFlow

```python
import tensorflow
x = tensorflow.constant([[1,5,2],[3,4,6]], dtype=tensorflow.int64)
y = tensorflow.reshape(tensorflow.argmax(x), [1,3])
print("value:", y)
print("shape:", y.shape)
print("dtype:", y.dtype)
# value: tf.Tensor([[1 0 1]], shape=(1, 3), dtype=int64)
# shape: (1, 3)
# dtype: <dtype: 'int64'>
```

#### ONNX

https://onnx.ai/onnx/operators/onnx__ArgMin.html
#### CoreML
Effective pseudocode for each:

#### PyTorch and NumPy pseudocode

```
func argMinSingleAxisForPyTorchAndNumPy(input, axis)
if axis defined
argMinNd([axis])
else
axes = list(range(input.rank))
argMinNd(axes)
endif
endfunc
```

#### TensorFlow and ONNX pseudocode

```
func argMinSingleAxisForTensorFlowAndOnnx(input, axis)
if axis defined
argMinNd([axis])
else
argMinNd([0])
endif
endfunc
```

#### CoreML pseudocode

```
func argMinSingleAxisForCoreML(input, axis)
if axis defined
argMinNd([axis])
else
argMinNd([input.rank - 1])
endif
endfunc
```

### Final verdict

All that said, you'll see my original proposal had a single axis, and I even defaulted it to 0 too 😅...

```webidl
dictionary MLArgMinMaxOptions {
unsigned long axis = 0;
boolean keepDimensions = false;
boolean selectLastIndex = false; // selects the *last* index rather than the first find along axis
// NAMING: Maybe an enum for scanDirection or tieBreakerFirst/Last would be clearer?
};
```

...and so I'm obviously not that opposed to a single axis. Though, if we're now thinking of changing `axes` to an `axis` and diverging from the other reduction ops, then I'm thinking we should require an explicit axis due to the platform differences above, rather than just defaulting to 0, which seems a weird default anyway (because why would you want to default to finding the min/max across batches?). Perusing a dozen models I have locally with argMax in them, I see 9 used axis=1, 2 used axis=-1, and 1 used axis=0. So if we were to pick a default, axis=1 might make more sense, but I'd still rather just be explicit at this API level.
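As an aside on `selectLastIndex`: NumPy's `argmax` always returns the first occurrence of the maximum, so a last-index variant has to be emulated; a small sketch of the first-vs-last tie-breaking distinction:

```python
import numpy as np

x = np.array([2, 9, 1, 9])
print(np.argmax(x))  # 1 — argmax returns the index of the *first* maximum
# Emulate a selectLastIndex-style option by scanning the reversed array:
print(len(x) - 1 - np.argmax(x[::-1]))  # 3 — index of the *last* maximum
```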
@fdwr Thanks! I understand that the multi-axes scenario can be emulated. Thanks for providing the perspective of looking at it together with the other reduction methods.
Yeah, that's my perspective. If we don't have such use cases, it doesn't seem worth supporting, given that both TFLite and CoreML would need to emulate it. On the default value, given your reasoning, not having a default value sounds good to me.
@fdwr, thanks for the details!

Requiring an explicit axis SGTM.
Comparing with softmax... any preferences for:

…

vs

…

? I recall reading that required parameters are better as explicit parameters rather than dictionary members, but I'm not finding that verbiage in the spec 🤔.
@fdwr, IIUC, …
Once the …
And scalar input should also be unsupported, because it would fail: CoreML doesn't support scalar input (refer to @philloooo's CL).
The "axis is a scalar" constraint is enforced in the `arg_min_max::Prepare()` function, here: third_party/tflite/src/tensorflow/lite/kernels/arg_min_max.cc.