src:cpu:aarch64 restricts the conv operation to just one post-op when the src/wei/dst data type is fp16 #1689

Conversation
Hi @renato-arantes, thanks for the patch.
Hi @mgouicem, the root cause is that oneDNN conducts all fp16 post-op operations in fp32 and then downgrades to fp16 at the end, leading to a rounding error. To bypass this issue, we decided to limit the conv operation to a single post-op when the src/wei/dst data type is fp16. Here are two
I have several more if you need them.
This is the expected behavior in oneDNN, and this is exactly what benchdnn checks. I just confirmed that these two tests pass on the x64 implementation, which has that exact behavior (computing post-ops in f32).
Could you clarify how this removes the f32->f16 down-conversion? I would expect the f32->f16 conversion to be there independently of the number of post-ops, since convolution accumulates to f32 anyway.
Hi @mgouicem,
Both fail for me as this patch is for aarch64 and not for x64! You can check that my patch only changes files under Here is the
I built oneDNN from the
Hi @mgouicem,
ACL does all post-ops in f16 when demanded by
This patch doesn't remove the down-conversion, but on aarch64 it limits the number of post-op operations to just one, so the probability of a comparison failure against the ACL result due to the f32->f16 downcast is lower. Currently, I haven't seen any test with just one post-op that fails.
Sorry if my point was not clear. What I meant is that if post-ops are run in f32, and the down-conversion happens only once before writing to dst (as in the x64 implementation), then there should be no failure in benchdnn for those test cases.
It makes sense that the issue would be post-ops using f16 instead of f32. There were a few discussions on this topic, and the plan was to rely on f32 post-ops for ACL (see this PR and this PR). Was there any change with respect to this?
Hence my concern: this PR does not seem to address the actual issue (the use of f16 in post-ops), but mostly hides it (benchdnn uses a higher threshold when post-ops are used, so with one post-op not much error is compounded and the test passes).
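The compounding effect described above can be sketched numerically. The following is an illustrative numpy example, not oneDNN code: the values and the number of post-ops are arbitrary assumptions, chosen only to show that rounding to f16 after every post-op drifts further from the exact result than computing the post-op chain in f32 and down-converting once.

```python
import numpy as np

# Illustrative sketch (not oneDNN code): an f32 accumulator followed by
# eight small additive post-ops (values are arbitrary for demonstration).
acc = np.float32(0.1)               # convolution accumulator (f32)
post_ops = [np.float32(1e-4)] * 8   # hypothetical binary-add post-ops

# Strategy A: every post-op computed and stored in f16,
# so the result is rounded after each step.
a = np.float16(acc)
for p in post_ops:
    a = np.float16(a + np.float16(p))

# Strategy B: post-ops kept in f32, with a single
# down-conversion before writing to dst.
b = acc
for p in post_ops:
    b = b + p
b = np.float16(b)

exact = 0.1 + 8 * 1e-4
# Strategy A accumulates a rounding error at every step, so it ends up
# further from the exact value than strategy B.
assert abs(float(a) - exact) > abs(float(b) - exact)
```

This is why a single post-op stays within benchdnn's threshold while longer f16 chains do not: the per-step rounding errors of strategy A add up, whereas strategy B pays only one rounding step at the end.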
Discussed this PR with @milpuz01, @dzarukin and @cfRod:
The API to manage accumulation mode is now merged to the main branch. Please decide whether you want to continue working on this PR or open a new one.
Closing as stale. |
Description
When using fp16 on AArch64, there are multiple failed conv tests due to a precision problem in oneDNN's fp16 operations. oneDNN conducts all fp16 post-op operations in fp32 and then downgrades to fp16 at the end, leading to a rounding error. To resolve this issue, this patch limits the conv operation to a single post-op when the src/wei/dst data type is fp16.
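For scale, the error introduced by that final downgrade is small and bounded: a single correctly-rounded f32->f16 conversion is off by at most half an f16 ulp, i.e. a relative error of at most 2^-11 for normal values. A numpy sketch (the input value is an arbitrary example, not taken from any test):

```python
import numpy as np

# Sketch with an arbitrary example value: one f32 -> f16 down-conversion
# is correctly rounded, so its relative error is at most eps/2 = 2**-11
# (f16 machine epsilon is 2**-10).
x = np.float32(0.1008)   # hypothetical f32 result after post-ops
y = np.float16(x)        # single down-conversion before writing dst
rel_err = abs(float(y) - float(x)) / float(x)
assert rel_err <= 2.0 ** -11
```

A single conversion therefore stays well within the correctness thresholds a checker like benchdnn applies; it is the repeated rounding inside an f16 post-op chain that pushes results over the threshold.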
Checklist
General
- Do all unit and benchdnn tests (`make test` and `make test_benchdnn_*`) pass locally for each commit?