
How to define the algorithm of L2_Pool2d? #278

Closed
mingmingtasd opened this issue Jul 1, 2022 · 21 comments

Comments

@mingmingtasd
Contributor

As you know, the algorithm of L2_Pool2d is based on the
Lp-normalization function, which should be Y = (X1^P + X2^P + ... + Xn^P) ^ (1/P).

But for L2_pool2d, I am not sure whether we need to average the sum of squared elements, as
Y = ((X1^2 + X2^2 + ... + Xn^2)/n) ^ (1/2), or directly use the Lp-normalization function, as
Y = (X1^2 + X2^2 + ... + Xn^2) ^ (1/2).
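For concreteness, a toy check in Python (my own illustration, not from any spec) showing that the two candidate definitions give different results on the same window:

import math

window = [1.0, 2.0, 3.0, 4.0]
sum_sq = sum(x ** 2 for x in window)            # X1^2 + ... + Xn^2 = 30.0

no_average = math.sqrt(sum_sq)                  # (30)^(1/2)     ~= 5.477
with_average = math.sqrt(sum_sq / len(window))  # (30 / 4)^(1/2) ~= 2.739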

I found two papers, https://sci-hub.yncjkj.com/10.1109/cvpr.2011.5995370 and https://sci-hub.yncjkj.com/10.1109/tcsvt.2015.2461978; they describe the Lp-normalization as below:

[Screenshots: Lp-norm definitions from the two papers]

So I can confirm that the Lp-normalization function should be Y = (X1^P + X2^P + ... + Xn^P) ^ (1/P), but I am still not sure whether we need to average the sum of elements for L2_pool2d. I went through some framework API specs and found the descriptions below:

1. NNAPI ANEURALNETWORKS_L2_POOL_2D:
output[b, i, j, c] =
sqrt(sum_{di, dj} pow(input[b, strides[1] * i + di, strides[2] * j + dj, c], 2) /
sum(1))
2. ONNX LpPool:
LpPool consumes an input tensor X and applies Lp pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Lp pooling consisting of computing the Lp norm on all values of a subset of the input tensor according to the kernel size and downsampling the data into the output tensor Y for further processing.
3. OpenVINO: Not Supported
4. DML DML_LP_POOLING_OPERATOR_DESC :
Computes the Lp-normalized value across the elements within the sliding window over the input tensor. The value of the P variable in the Lp-normalization function Y = (X1^P + X2^P + ... + Xn^P) ^ (1/P), where X1 to Xn representing each of the values within the sliding window. In common use cases, this value is either set to 1 or 2, representing either the L1 or L2 normalization respectively.

So it seems that NNAPI ANEURALNETWORKS_L2_POOL_2D does average, but after verifying on DML, DML_LP_POOLING_OPERATOR_DESC doesn't. Thus the algorithm and implementation of l2_pool2d may differ across these frameworks.

@anssiko
Member

anssiko commented Sep 1, 2022

@huningxin, please report your proposed approach in this issue for the WG to review when you've discussed this issue with @mingmingtasd.

@fdwr
Collaborator

fdwr commented Feb 6, 2024

...I am not sure whether we need to average the sum of elements...

@mingmingtasd : What sources are you seeing that average elements before the ^ (1/p)? Everything above appears consistent, including the papers, ONNX, DML, and NNAPI, all supporting Lebesgue Pooling for p = 2 as Y = (X1^2 + X2^2 + ... + Xn^2) ^ (1/2). The / sum(1) in NNAPI is odd but ignorable, as it doesn't affect the result.

@mingmingtasd
Contributor Author

mingmingtasd commented Feb 6, 2024

...I am not sure whether we need to average the sum of elements...

@mingmingtasd : What sources are you seeing that average elements before the ^ (1/p)? Everything above appears consistent, including the papers, ONNX, DML, and NNAPI, all supporting Lebesgue Pooling for p = 2 as Y = (X1^2 + X2^2 + ... + Xn^2) ^ (1/2). The / sum(1) in NNAPI is odd but ignorable.

Agreed. L2Pool2d should follow and be based on the Lp-normalization function, Y = (X1^P + X2^P + ... + Xn^P) ^ (1/P).

@fdwr
Collaborator

fdwr commented Feb 11, 2024

@mingmingtasd Is there anything remaining unresolved on this one, or can it be closed?

@mingmingtasd
Contributor Author

@mingmingtasd Is there anything remaining unresolved on this one, or can it be closed?

Let's close it, thanks.

@fujunwei

fujunwei commented Jun 6, 2024

TFLite also averages over the count of summed elements, ((X1^2 + X2^2 + ... + Xn^2)/n) ^ (1/2); here is the l2_pool2d kernel. So how do we keep compatibility between TFLite models and ONNX models?

@huningxin
Contributor

@fujunwei

TFLite also averages over the count of summed elements, ((X1^2 + X2^2 + ... + Xn^2)/n) ^ (1/2)

As discussed before, L2 normalization should be calculated as Y = (X1^2 + X2^2 + ... + Xn^2) ^ (1/2). Would this be an issue with the TFLite implementation? @reillyeon

BTW, do we have implementation experience on CoreML's l2_pool? https://apple.github.io/coremltools/source/coremltools.converters.mil.mil.ops.defs.html#coremltools.converters.mil.mil.ops.defs.iOS15.pool.l2_pool @philloooo

@fdwr
Collaborator

fdwr commented Aug 13, 2024

Would this be an issue with the TFLite implementation?

🤔 If TFLite defines it that way, that may be a useful operation, but it's something besides L2 pooling 👀.

function poolLebesgue(input, axes, windowDimensions, padding, strides, dilations, exponent)
    // y = (x1^p + x2^p + ... + xn^p) ^ (1/p)    // y is the reduced output for all applicable inputs

    return root(poolSum(pow(input, exponent), axes, windowDimensions, padding, strides, dilations), exponent)
endfunction
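For reference, a minimal runnable Python sketch of the pseudocode above (my own hypothetical helper, assuming a single-channel 2D floating-point input, with no padding or dilation for brevity):

import numpy as np

def pool_lebesgue(x, window, strides=(1, 1), p=2.0):
    # y = (x1^p + x2^p + ... + xn^p) ^ (1/p) over each sliding window
    kh, kw = window
    sh, sw = strides
    out_h = (x.shape[0] - kh) // sh + 1
    out_w = (x.shape[1] - kw) // sw + 1
    y = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * sh:i * sh + kh, j * sw:j * sw + kw]
            y[i, j] = np.sum(patch ** p) ** (1.0 / p)
    return y

With p = 2 this reduces to the L2 pooling formula discussed above.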

@a-sully
Contributor

a-sully commented Aug 13, 2024

🤔 If TFLite defines it that way, that may be a useful operation, but it's something besides L2 pooling 👀.

Agreed. It seems like a TFLite WebNN backend will have to decompose l2pool2d... that raises the questions:

1. Is this decomposition expressible in WebNN?

Recall that we eventually need to clearly define all WebNN operators #462

return root(poolSum(pow(input, exponent), axes, windowDimensions, padding, strides, dilations), exponent)

(poolSum is not an existing WebNN operator)

2. Is this decomposition expressible in TFLite?

Anything is possible - especially if a device type is not mandated #749 - but potentially with severe performance cliffs, especially for non-CPU backends (e.g.), and especially if the operator implementation has to be hand-rolled.

FWIW #689 has the same issue: if at least two backends (i.e. Core ML and DML) have consistent behavior, then maybe that's okay (at least once discrepancies like #180 are resolved!)

@a-sully
Contributor

a-sully commented Aug 13, 2024

especially if the operator implementation has to be hand-rolled

I just realized #180 (comment) provides a decomposition of l2pool2d using sqrt(conv2d(pow(input, 2), filterOfOnes, {dilations...})) - thanks @fdwr!

@fdwr
Collaborator

fdwr commented Aug 14, 2024

poolSum is not [an existing WebNN operator]

@a-sully The more time I devote to thinking of expressing aggregate operators in terms of more fundamental operators, the more I realize some primitive ops (like poolSum) are missing. Even if not directly useful by themselves for ML, they can be useful for composition of others. At least for poolSum (see here), there's an easy decomposition using convolution:

function poolSum(input, axes, windowDimensions, padding, strides, dilations)
    return poolGeneric(input, axes, windowDimensions, padding, strides, dilations, add, 0)
    // OR  convolve(input, filter = ones(windowDimensions), axes, windowDimensions, padding, strides, dilations)
endfunction
  1. Is this decomposition expressible in WebNN?

Yep.

  2. Is this decomposition expressible in TFLite?

Yep.

if at least two backends (i.e. Core ML and DML) have consistent behavior, then maybe that's okay

Well, these {ONNX, DML, NNAPI, the original paper} agree (and I suspect CoreML too).

@huningxin
Contributor

@fdwr

function poolSum(input, axes, windowDimensions, padding, strides, dilations)
    return poolGeneric(input, axes, windowDimensions, padding, strides, dilations, add, 0)
    // OR  convolve(input, filter = ones(windowDimensions), axes, windowDimensions, padding, strides, dilations)
endfunction

Thanks for the decomposition! It's very helpful. IIUC, we may need to set the convolution groups to the number of input channels, and make the all-ones filter of shape [groups, 1, windowDimensions.height, windowDimensions.width], as sketched below.
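To illustrate, a minimal numpy sketch of that grouped (depthwise) convolution decomposition (my own hypothetical names; NCHW layout, stride 1, no padding), where groups equals the input channel count and the all-ones filter has shape [groups, 1, kh, kw]:

import numpy as np

def l2_pool_via_conv(x, kh, kw):
    # x: [N, C, H, W]; depthwise convolution with groups = C and an
    # all-ones filter of shape [C, 1, kh, kw], applied to the squared input
    n, c, h, w = x.shape
    ones_filter = np.ones((c, 1, kh, kw), dtype=x.dtype)
    squared = x ** 2
    out_h, out_w = h - kh + 1, w - kw + 1
    y = np.empty((n, c, out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = squared[:, :, i:i + kh, j:j + kw]            # [N, C, kh, kw]
            # each channel is convolved with its own ones kernel (groups = C)
            y[:, :, i, j] = np.sum(patch * ones_filter[:, 0], axis=(2, 3))
    return np.sqrt(y)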

@a-sully
Contributor

a-sully commented Aug 14, 2024

Well, these {ONNX, DML, NNAPI} agree (and I suspect CoreML too).

Not related to l2Pool2d, but some thoughts regarding operator compatibility across platforms:

  • I understand that ONNX and DML are different things, but in these discussions, I consider them to be ~one backend. They're cooperating layers of the same stack :)
  • NNAPI is being deprecated in Android 15; it should not factor into decisions about WebNN.

@fujunwei

fujunwei commented Aug 15, 2024

I filed the issue in TensorFlow; they also consider it an issue in the TFLite kernel implementation, and they may fix it later.

@fdwr
Collaborator

fdwr commented Aug 15, 2024

I filed the issue in TensorFlow; they also consider it an issue in the TFLite kernel implementation, and they may fix it later.

Junwei: Thanks for filing. So it appears WebNN's TFLite backend would need a decomposition until any future TFLite fix.

potentially with severe performance cliffs ... especially if the operator implementation has to be hand-rolled

Austin: If it's any perf consolation, LP pooling is evidently not so common (only a few models in my little stash of 1411 model files).

@huningxin
Contributor

@a-sully

  • I understand that ONNX and DML are different things, but in these discussions, I consider them to be ~one backend. They're cooperating layers of the same stack :)

Regarding our new op proposal checklist, there are two aspects: cross-framework support and cross-platform implementability. I understand we usually study ONNX / ONNXRuntime as one example of a framework, alongside TensorFlow and PyTorch etc., and investigate DML as one example of a platform API, alongside TFLite and CoreML.

@a-sully
Contributor

a-sully commented Aug 15, 2024

Austin: If it's any perf consolation, LP pooling is evidently not so common (only a few models in my little stash of 1411 model files).

If that's the case and there's a straightforward decomposition which could be performed in "userspace"... is this operator needed in WebNN at all?

Put another way, if this operator wasn't already in the WebNN spec, would it pass the new op proposal checklist?

copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this issue Aug 19, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this issue Aug 20, 2024
@fdwr
Collaborator

fdwr commented Aug 23, 2024

It seems like a TFLite WebNN backend will have to decompose l2pool2d..
...
copybara-service bot mentioned this issue 3 days ago

Woot, it appears that TFLite is already fixed (per https://github.com/tensorflow/tensorflow/pull/74079/files), which means it's just contingent on Chromium updating its TF version. Given there will now be a direct call to TFLite with no decomposition, does that change the difficulty of implementing this?

is this operator needed in WebNN at all? ...
if this operator wasn't already in the WebNN spec, would it pass the new op proposal checklist?

I think it's still worth keeping to complete the collection of pooling operations, and several backends offer implementations faster than the decomposition, suggesting it's useful. Even if it's rare in my mini-model collection, I do see people asking questions about it on forums, indicating utility. Barring that one implementation bug (now fixed), the backends implement it consistently too (unlike the potentially more dubious localResponseNormalization, where multiple implementations have small complicating differences).

  • Use Case. What user scenarios or experiences will benefit from this operation, and why aren't existing operations sufficient?
    Computer vision and pattern recognition.
  • Sample models. What are specific models that enable the target use case? One or more sample models as references are required.
    VGG with L2 pooling in Geodesics of learned representations https://arxiv.org/abs/1511.06394
  • Cross-framework support. Is an identical or similar operation supported by multiple popular frameworks? What are they?
    Yes.
  • Cross-platform implementability. Is the operation implementable in more than one platform? What are they?
    Yes.

@a-sully
Contributor

a-sully commented Aug 23, 2024

Thanks for thoroughly following through with this issue, @fdwr. TFLite's alignment with the other platforms does improve the "Cross-framework support" line item. Seems reasonable to me 👍. Filed https://crbug.com/361717758 to track implementation in Chromium.

Can we close this issue?

@fdwr
Collaborator

fdwr commented Aug 23, 2024

Mingming, I'm closing it from the spec perspective, as Austin created a Chromium issue for it. 👍 Thank you for raising it.

@fdwr fdwr closed this as completed Aug 23, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this issue Aug 23, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this issue Sep 4, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this issue Sep 5, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this issue Sep 6, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this issue Sep 11, 2024