Max ULP distance for WebNN ops with pr #139 - ULP comparison #144
@huningxin PTAL, thanks.
This is great input for this week's WG discussion on conformance testing. Thanks much @BruceDai.
@BruceDai Can you please explain what you mean by "max ULP distance" and how you plan to use it? Are we looking to use the result from WebNN-native on CPU as our baseline?
Bruce is using the result of the WebNN-polyfill CPU backend as the baseline. The WebNN-polyfill CPU backend is based on the TF.js CPU backend, which uses JavaScript numbers to calculate kernels, so I suppose the results should have double precision. /cc @pyu10055 Regarding the current WebNN-native CPU backends, say OpenVINO CPU, XNNPACK and oneDNN, I understand they are single precision and might not meet the baseline requirement.
@wchao1115 The max ULP distance means the maximum of the ULP distances between the actual output and the baseline. Here's a sample of
There's a problem that the max ULP distance would update with
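For concreteness, a ULP distance between two float32 values can be sketched as the absolute difference of their raw 32-bit patterns (this is an illustration only, not necessarily the exact code in pr #139):

```js
// Sketch (an assumption, not the PR's implementation): ULP distance between
// two float32 values, computed from their raw 32-bit representations.
// Note this makes -0.0 and +0.0 appear 2^31 apart.
function ulpDistance(a, b) {
  const buf = new ArrayBuffer(8);
  const f32 = new Float32Array(buf);
  const u32 = new Uint32Array(buf);
  f32[0] = a;
  f32[1] = b;
  // Absolute difference of the raw bit patterns, as doubles (exact here).
  return Math.abs(u32[0] - u32[1]);
}

// ulpDistance(0.0, -0.0) === 2147483648
```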
What's the strategy for defining an acceptable ULP distance?
@BruceDai Your ULP values seem high. For reference, the
@huningxin Are you sure that the baseline result here is from a pure double-precision compute on the CPU? If the baseline is indeed a double-precision result, then you will need to truncate it down to a single-precision value before comparing it with the single-precision result from the WebGL backend. The 2 inputs to the
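As a small illustration of that truncation step (the input value here is just Bruce's first pow input, reused as an example):

```js
// Double-precision baseline computed in JavaScript.
const baselineF64 = Math.pow(0.33435354, 30);
// Truncate to the nearest single-precision value before ULP comparison;
// the comparison must use the float32 value, matching the precision of
// the backend under test.
const baselineF32 = Math.fround(baselineF64);
// baselineF32 !== baselineF64: precision is lost in the truncation.
```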
I'll be happy to add a
I believe so, because AFAIK JavaScript performs double-precision arithmetic, and the tfjs-backend-cpu kernels are implemented in JavaScript.
I suppose this is also true, because the double-precision results are stored back to a
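A tiny sketch of that point: JavaScript arithmetic stays in double precision, and rounding only happens when the result is stored into a 32-bit typed array.

```js
// IEEE-754 double-precision arithmetic throughout the JS kernel.
const d = 0.1 + 0.2;                 // 0.30000000000000004 (double)
// Precision is lost only at the typed-array store.
const f = new Float32Array([d])[0];  // rounded to the nearest float32
// f !== d: the stored value is the float32 rounding of the double result.
```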
We probably could compute the baseline in JavaScript along with the test cases (as part of the w-p-t). For Bruce's example:

```js
const input = [0.33435354, 0.57139647, 0.03689031];
const exponent = 30;

// Compute the double-precision baseline
const baseline = input.map(x => Math.pow(x, exponent));
// baseline = [5.323261130422279e-15, 5.1065382759817323e-8, 1.0171128528373136e-43]

// Truncate the double-precision baseline to single-precision
const baselineInFloat32 = new Float32Array(baseline);
// baselineInFloat32 = [5.323261142732008e-15, 5.106538125687621e-8, 1.0229478789571165e-43]

// Then do ULP comparison with the results of WebNN pow
```

This is an extremely simplified example. The baseline of other, more complex ops would require more effort to implement the compute kernel; as a reference, the tf.js conv2d JS kernel is ~150 LOC. The effort might be worthwhile, because this could help us establish a baseline that would meet the requirements raised in WebML WG Teleconference – 2 Dec 2021, like the one by @wchao1115
Any thoughts?
This would match what I have in mind, indeed. FWIW, while maintaining it in WPT is an option, I think we don't need to make this a requirement; at the end of the day, what is needed in WPT is only the results of the computation, not the computation code itself. In particular, given the amount of WPT-specific infrastructure in that repo, we might be better served by a lighter-weight dedicated repo to build and audit the baseline.
I developed an experimental double-precision baseline implementation of the element-wise binary ops, referring to the https://github.com/tensorflow/tfjs code. Here are screenshots of running the float32 binary tests on WebNN-native backends (DML-GPU and OpenVINO-CPU) under the criteria of
There are 22 binary tests: 16 pass and 6 fail on the DML backend, 17 pass and 5 fail on the OpenVINO backend.
figure-1 testing by DirectML backend (GPU)
figure-2 testing by OpenVINO backend (CPU)
@wchao1115 @huningxin @dontcallmedom PTAL, thanks.
Since the CPU backend uses double precision, I collected the max ULP distance for WebNN ops using the result of the CPU backend (tfjs-backend-cpu based) as the baseline with pr #139 - ULP comparison on three devices.

Here are some observations:

- relu(negative number) on Device 1 & 2 + WebGL backend is -0.0; its distance to the baseline 0.0 is 2147483648.

Open:

- For the relu operation with negative input, some devices compute -0.0 while the expected result on the CPU backend is 0.0. The ULP distance between -0.0 and the baseline 0.0 is 2147483648, while with non-negative input the max distance is 0. How do we decide an acceptable ULP distance for the relu op?

Distance details:
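One possible mitigation for the relu case (an assumption for discussion, not a conclusion of this thread) is to normalize negative zero before the ULP comparison:

```js
// Map -0.0 to +0.0 so relu outputs that differ only in the sign of zero
// compare as identical; all other values pass through unchanged.
const normalizeZero = (x) => (Object.is(x, -0) ? 0 : x);

// normalizeZero(-0.0) is +0.0, so its ULP distance to the baseline 0.0 is 0.
```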