
Should axes be required parameter for layerNormalization build method? #487

Open

huningxin (Contributor) opened this issue Dec 1, 2023 · 1 comment
(This was raised by @wacky6 at Chromium CL-5068275 review)

In the proposal Add support for operations needed for well-known transformers, the axes option of the layerNormalization operator is defined as an optional member of the MLLayerNormalizationOptions dictionary with the default value [1, 2, 3].

axes, of type sequence<unsigned long>
The indices of the input dimensions to reduce. When this member is not present, it is assumed to be [1, 2, 3]; that is, the mean and variance are calculated across all the input features for each individual sample in the batch.

@wacky6 mentioned that TensorFlow's LayerNormalization defaults to the last dimension (axis = -1), while PyTorch's LayerNorm requires normalized_shape to be passed explicitly.
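To make the divergence concrete, here is a small sketch (plain JavaScript; the helper names are hypothetical, not part of any framework) computing the default axes each convention yields for a 4D input of shape [N, C, H, W]:

```javascript
// Hypothetical helpers illustrating the three axes conventions discussed above.

// WebNN proposal / original layerNorm paper: all axes except the batch (axis 0).
function allButBatchAxes(rank) {
  return Array.from({ length: rank - 1 }, (_, i) => i + 1);
}

// TensorFlow LayerNormalization default: only the last axis (axis = -1).
function lastAxisOnly(rank) {
  return [rank - 1];
}

// PyTorch LayerNorm: the caller passes normalized_shape, which maps to the
// trailing axes whose lengths match that shape.
function axesFromNormalizedShape(rank, normalizedShape) {
  const start = rank - normalizedShape.length;
  return Array.from({ length: normalizedShape.length }, (_, i) => start + i);
}

const rank = 4; // e.g. input shape [N, C, H, W]
console.log(allButBatchAxes(rank));                 // [1, 2, 3]
console.log(lastAxisOnly(rank));                    // [3]
console.log(axesFromNormalizedShape(rank, [8, 8])); // [2, 3] for trailing H, W
```

The same 4D input thus normalizes over three different axis sets depending on which framework's default a caller has in mind, which is the crux of the concern.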

In terms of API design, it seems axes is required for this op to function. Should it be moved out of the options dictionary?

layerNormalization(sequence<[EnforceRange] unsigned long> axes, options)
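For illustration only, a plain-JavaScript mock (not the real WebNN MLGraphBuilder) of what the proposed required-positional-parameter shape would mean for callers, assuming a missing or empty axes is rejected:

```javascript
// Mock sketch of the proposed signature layerNormalization(axes, options),
// where axes is a required positional parameter. Not the actual WebNN API.
function layerNormalization(input, axes, options = {}) {
  if (!Array.isArray(axes) || axes.length === 0) {
    throw new TypeError(
      "layerNormalization: axes is required and must be a non-empty sequence"
    );
  }
  // A real implementation would reduce mean/variance over `axes`;
  // this mock just echoes the validated arguments.
  return { input, axes, ...options };
}

// Callers must state the axes explicitly; no implicit default is applied.
const op = layerNormalization("inputTensor", [1, 2, 3], { epsilon: 1e-5 });
```

Under this shape, omitting axes fails at the call site rather than silently picking one framework's default.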

/cc @wchao1115 @fdwr

@fdwr (Collaborator) commented Dec 2, 2023

Ningxin: Different ML libraries having different defaults for axes is a little concerning, as it's tough to favor one default over another here. There's even divergence between the original paper that introduced layerNorm (where the default includes all axes except dimension 0, so a 4D input yields axes = [1, 2, 3]) and the ONNX/TF default (only the last axis, so 4D yields axes = [3]). Given the layer in the hierarchy that WebNN targets (a lower-level backend, less end-user facing), I generally favor being explicit over being implicit, and so I'd support requiring the axes attribute (not undefined). Frameworks clearly know their own expected behavior.

Though whether to pass axes as a function parameter vs. inside the options seems orthogonal to whether axes should be required, because I see that resample also has axes in MLResample2dOptions, and it is a required attribute (not passing axes there yields an error). So, given the resample precedent, we could keep axes inside MLLayerNormalizationOptions but still require it to be present.
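The resample-style alternative — keeping axes inside the options dictionary but validating its presence at build time — could be mocked like this (hypothetical names, a sketch rather than the actual implementation):

```javascript
// Mock sketch: axes stays inside the options dictionary (as with
// MLResample2dOptions), but the build method rejects a dictionary
// that omits it instead of applying a default. Not the real WebNN API.
function layerNormalization(input, options = {}) {
  if (!Array.isArray(options.axes) || options.axes.length === 0) {
    throw new TypeError(
      "layerNormalization: options.axes must be a non-empty sequence"
    );
  }
  // Echo the validated arguments; a real build method would record the op.
  return { input, axes: options.axes };
}

// The dictionary shape is unchanged, but axes is now effectively required.
const op = layerNormalization("inputTensor", { axes: [1, 2, 3] });
```

This keeps the existing dictionary shape while still forcing frameworks to state their intended reduction axes.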
