
Should axes be required parameter for layerNormalization build method? #487

Open

huningxin (Contributor) opened this issue Dec 1, 2023 · 1 comment
(This was raised by @wacky6 at Chromium CL-5068275 review)

In the proposal Add support for operations needed for well-known transformers, the axes option of the layerNormalization operator is defined as an optional member of the MLLayerNormalizationOptions dictionary with the default value [1, 2, 3].

axes, of type sequence<unsigned long>
The indices of the input dimensions to reduce. When this member is not present, it is assumed to be [1, 2, 3]; that is, the mean and variance are calculated across all the input features for each individual sample in the batch.

@wacky6 mentioned that TensorFlow's LayerNormalization defaults to the last dimension (axis = -1), while PyTorch's LayerNorm requires normalized_shape to be passed explicitly.
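To make the divergence concrete, here is a small sketch (plain JavaScript; the helper names are hypothetical, not part of any framework) computing the default axes each convention yields for a 4D input of shape [N, C, H, W]:

```javascript
// Hypothetical helpers illustrating the three axes conventions discussed above.

// WebNN proposal / original layerNorm paper: all axes except the batch (axis 0).
function allButBatchAxes(rank) {
  return Array.from({ length: rank - 1 }, (_, i) => i + 1);
}

// TensorFlow LayerNormalization default: only the last axis (axis = -1).
function lastAxisOnly(rank) {
  return [rank - 1];
}

// PyTorch LayerNorm: the caller passes normalized_shape, which maps to the
// trailing axes whose lengths match that shape.
function axesFromNormalizedShape(rank, normalizedShape) {
  const start = rank - normalizedShape.length;
  return Array.from({ length: normalizedShape.length }, (_, i) => start + i);
}

const rank = 4; // e.g. input shape [N, C, H, W]
console.log(allButBatchAxes(rank));                 // [1, 2, 3]
console.log(lastAxisOnly(rank));                    // [3]
console.log(axesFromNormalizedShape(rank, [8, 8])); // [2, 3] for trailing H, W
```

The same 4D input thus normalizes over three different axis sets depending on which framework's default a caller has in mind, which is the crux of the concern.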

In terms of API design, it seems axes is required for this op to function. Should it be moved out of the options dictionary?

layerNormalization(sequence<[EnforceRange] unsigned long> axes, options)
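For illustration only, a plain-JavaScript mock (not the real WebNN MLGraphBuilder) of what the proposed required-positional-parameter shape would mean for callers, assuming a missing or empty axes is rejected:

```javascript
// Mock sketch of the proposed signature layerNormalization(axes, options),
// where axes is a required positional parameter. Not the actual WebNN API.
function layerNormalization(input, axes, options = {}) {
  if (!Array.isArray(axes) || axes.length === 0) {
    throw new TypeError(
      "layerNormalization: axes is required and must be a non-empty sequence"
    );
  }
  // A real implementation would reduce mean/variance over `axes`;
  // this mock just echoes the validated arguments.
  return { input, axes, ...options };
}

// Callers must state the axes explicitly; no implicit default is applied.
const op = layerNormalization("inputTensor", [1, 2, 3], { epsilon: 1e-5 });
```

Under this shape, omitting axes fails at the call site rather than silently picking one framework's default.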

/cc @wchao1115 @fdwr

@fdwr (Collaborator) commented Dec 2, 2023

Ningxin: Different ML libraries having different defaults for axes is a little concerning, as it's tough to favor one default over another here. There's even divergence between the original paper that introduced layerNorm (where the default includes all axes except dimension 0, so a 4D input yields axes = [1, 2, 3]) and the ONNX/TF default (only the last axis, so 4D yields axes = [3]). Given the layer in the hierarchy that WebNN targets (a lower-level backend, less end-user facing), I generally favor being explicit over being implicit, and so I'd support requiring the axes attribute (not undefined). Frameworks clearly know their own expected behavior.

Though whether to pass axes as a function parameter vs. inside the options seems orthogonal to whether axes should be required, because I see that resample also has axes in MLResample2dOptions, and it is a required attribute (not passing axes there yields an error). So, given the resample precedent, we could keep axes inside MLLayerNormalizationOptions but still require it to be present.
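The resample-style alternative — keeping axes inside the options dictionary but validating its presence at build time — could be mocked like this (hypothetical names, a sketch rather than the actual implementation):

```javascript
// Mock sketch: axes stays inside the options dictionary (as with
// MLResample2dOptions), but the build method rejects a dictionary
// that omits it instead of applying a default. Not the real WebNN API.
function layerNormalization(input, options = {}) {
  if (!Array.isArray(options.axes) || options.axes.length === 0) {
    throw new TypeError(
      "layerNormalization: options.axes must be a non-empty sequence"
    );
  }
  // Echo the validated arguments; a real build method would record the op.
  return { input, axes: options.axes };
}

// The dictionary shape is unchanged, but axes is now effectively required.
const op = layerNormalization("inputTensor", { axes: [1, 2, 3] });
```

This keeps the existing dictionary shape while still forcing frameworks to state their intended reduction axes.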
