-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Add Large Tensor Test for linalg_syrk #18782
Conversation
Hey @Zha0q1 , Thanks for submitting the PR
CI supported jobs: [windows-cpu, centos-cpu, windows-gpu, sanity, edge, miscellaneous, clang, centos-gpu, website, unix-gpu, unix-cpu] Note: |
@Zha0q1 can you fix this issue with your PR "This branch is out-of-date with the base branch" |
Fixed. |
tests/nightly/test_large_array.py
Outdated
A.attach_grad() | ||
with mx.autograd.record(): | ||
out = nd.linalg.syrk(A, alpha=2, transpose=False) | ||
for i in range(LARGE_SQ_X): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can check in 2 places in (0,y,y) and (1,y,y). No need to check in 70000 locations
tests/nightly/test_large_array.py
Outdated
assert out[0,i,i] == 2 | ||
assert_almost_equal(out[1,i,i], nd.array([0.02]), rtol=1e-3, atol=1e-5) | ||
out.backward() | ||
for i in range(LARGE_SQ_X): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
tests/nightly/test_large_array.py
Outdated
for i in range(LARGE_SQ_X): | ||
# check the first row | ||
assert A.grad[0,0,i] == 4 | ||
assert_almost_equal(A.grad[1,0,i], nd.array([0.4]), rtol=1e-3, atol=1e-5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Why did this become 0.4 and not 0.04 ? OR just let me know if this output is consistent with smaller inputs like 2x2 or 3x3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the correct result I believe. I verified with hand-written calculation. Yeah it also struck as counter-intuitive to me.. I am going to dive deep in matrix grad when I find time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just check with smaller input run and let me know the results. That should be good enough
Few comments ... overall code is good |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just check with smaller input run and let me know the results. That should be good enough .... overall LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks!
This has been tested with both small and large input tensors |
@@ -37,7 +37,7 @@ | |||
LARGE_X = 100000000 | |||
SMALL_X = 100 | |||
SMALL_Y = 50 | |||
LARGE_SQ_X = 80000 | |||
LARGE_SQ_X = 70000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
70000*70000/2**32
is just over 2**32
80k is lot more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We figured 7000 is large enough to overflow int 32. This is a new constant just introduced so the few of us decided to tweak it to 70000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
70k * 70k = 4.9 billion
int32 range < 4.3 billion
Therefore increasing input size will only increase test runtime.
@Zha0q1 also add your name to Contributors.md in the upcoming PR [let's not retrigger CI for that]. |
will do! |
* add large tensor test for syrk, foward and backward * change to batch input * move syrk test into test-linalg Co-authored-by: Ubuntu <ubuntu@ip-172-31-6-47.us-west-2.compute.internal>
Description
Check if syrk_batch works correctly with large tensors. This test will fail with the current code base; #18752 should fix it.
TODO:
This test passes on both BLAS int32 and 64 builds.
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments