-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[MXNET-1206] Support NDArray indexing with None and Ellipsis #13143
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kohr-h Thanks for the contribution!
Can you please add tests for this change.
Should I open a JIRA ticket as well? |
@mxnet-label-bot [NDArray, pr-awaiting-review] |
@kohr-h you could create a JIRA here - https://issues.apache.org/jira/projects/MXNET/issues/MXNET-939?filter=allopenissues |
@kohr-h please ping here once you think the PR is ready for review. |
Yes, will do. The whole thing is more complex than anticipated (as usual), in particular with advanced indexing. So it's still work-in-progress. But I'm determined to push it through. |
@mxnet-label-bot update[NDArray, pr-work-in-progress] |
@kohr-h Thanks for the contribution! Here are some resources if you need help:
|
Getting closer. Advanced indexing with |
@kohr-h Thanks for working on this! can you look into failing CI builds? |
@mxnet-label-bot update [pr-work-in-progress, NDArray] |
@mxnet-label-bot update [pr-awaiting-testing, NDArray] |
This is almost finished. There are 2 failing tests for which I have no idea why they're failing and how to fix them. I'd appreciate some help. Here are the stacktraces (ignore the
Both tests fail because tensors with 0s in their shapes are handled incorrectly in the two failing functions |
@szha and @anirudh2290 Could you please have a took at this PR? Thank you! |
@kohr-h There are 7 CI tests failing currently, could you have a look? Thanks! |
Great to see this PR merged, thanks for the "last mile" effort @reminisce and @zoeygxy. Regarding the performance differences, the trend you observed is somewhat to be expected. The checks performed in basic indexing are more comprehensive than before, and in more cases than before, views are being returned. Thus, the cases where views have always been returned are a bit slower because of extra overhead, but the cases that turned from copy to view should amortize easily. The advanced indexing performance could have gone either way, so I'm happy about the improvement there. To see the effect more clearly, I think it would make sense to test very small arrays (pure overhead) and very large arrays (purely copy vs. view). How much are you worried about the performance hit for the view cases? If you really want to be on par with the old speed, you could re-introduce the same special-casing as in the old code, but IMO that would make the code more complex. My implementation strives to be as generic as possible and takes the small performance hit of manipulating tuples and slice objects a bit too much for simple cases. |
@kohr-h Thanks for the response. Since returning views is a very frequent operation in many dataset loaders in MXNet, we have to keep the performance on a par with the previous version. @zoeygxy will keep working on improving the performance of the cases where views are returned. Although such special handling may lead code to look a little bit disorganized, the performance gain is still worth the effort because we don't want to introduce performance regression in training jobs. |
@kohr-h
Please confirm that the line |
@DickJC123 It's part of fe6336d so I'll bounce the question to @reminisce. |
@reminisce Makes sense as a trade-off. |
@DickJC123 I believe so. Thanks for pointing out! I will take care of it. |
…13143) * Support NDArray indexing with None and Ellipsis * Update NDArray.__setitem__ docs with None and Ellipsis * Fix boolean flag in NDArray.__getitem__, add doctests * Add setitem test for None and Ellipsis * Fix wrong slice used, add cases to test_indexing * Revamp NDArray.__getitem__ and __setitem__ * Fix typo in error message of SetSliceOpOutputDimSize * Fix setting of array with integer indices * Fix basic __setitem__ for all test cases * WIP: fixing advanced indexing * REMOVE: printing in tests * Re-implement advanced indexing with None and Ellipsis * Fix lint errors * WIP: fix basic indexing * WIP: fix basic indexing * TEMP: print statements in tests * Fix op.slice with step<0 and end==-1 * Implement copy-free general contiguous indexing * Improve doctest of __getitem__ * Fix missing staticmethod * Remove superfluous _at and _slice * Fix lint errors * WIP: basic indexing * Remove print statements from tests * Fix call into op.slice in basic indexing, add doctest * Print failing index in setitem tests * Simplify implementation of advanced index broadcasting * Better printing for failing setitem tests * Remove list indexing restriction, fix value shape broadcasting * Fix bad string formatting * Fix bug in test_uniform * Test mutability of sliced array if contiguous * Fix whitespace error in matrix_op-inl.h * "Fix" pylint complaints * Temporarily disable failing unittests * Silence another pylint complaint * Fix size-0 array creation * Make scalar tensor assignment test check for IndexError * Re-activate operator tests with 0-size arrays * Use np.compat in tests with zeros in shape or empty shape * Change comment in autograd indexing test * Add more None-containing index tuples to indexing test * Disable None in advanced indexing test since it has not been supported * Fix sanity * Fix ci * Fix unit test failure * Fix __getitem__
…13143) * Support NDArray indexing with None and Ellipsis * Update NDArray.__setitem__ docs with None and Ellipsis * Fix boolean flag in NDArray.__getitem__, add doctests * Add setitem test for None and Ellipsis * Fix wrong slice used, add cases to test_indexing * Revamp NDArray.__getitem__ and __setitem__ * Fix typo in error message of SetSliceOpOutputDimSize * Fix setting of array with integer indices * Fix basic __setitem__ for all test cases * WIP: fixing advanced indexing * REMOVE: printing in tests * Re-implement advanced indexing with None and Ellipsis * Fix lint errors * WIP: fix basic indexing * WIP: fix basic indexing * TEMP: print statements in tests * Fix op.slice with step<0 and end==-1 * Implement copy-free general contiguous indexing * Improve doctest of __getitem__ * Fix missing staticmethod * Remove superfluous _at and _slice * Fix lint errors * WIP: basic indexing * Remove print statements from tests * Fix call into op.slice in basic indexing, add doctest * Print failing index in setitem tests * Simplify implementation of advanced index broadcasting * Better printing for failing setitem tests * Remove list indexing restriction, fix value shape broadcasting * Fix bad string formatting * Fix bug in test_uniform * Test mutability of sliced array if contiguous * Fix whitespace error in matrix_op-inl.h * "Fix" pylint complaints * Temporarily disable failing unittests * Silence another pylint complaint * Fix size-0 array creation * Make scalar tensor assignment test check for IndexError * Re-activate operator tests with 0-size arrays * Use np.compat in tests with zeros in shape or empty shape * Change comment in autograd indexing test * Add more None-containing index tuples to indexing test * Disable None in advanced indexing test since it has not been supported * Fix sanity * Fix ci * Fix unit test failure * Fix __getitem__
…13143) * Support NDArray indexing with None and Ellipsis * Update NDArray.__setitem__ docs with None and Ellipsis * Fix boolean flag in NDArray.__getitem__, add doctests * Add setitem test for None and Ellipsis * Fix wrong slice used, add cases to test_indexing * Revamp NDArray.__getitem__ and __setitem__ * Fix typo in error message of SetSliceOpOutputDimSize * Fix setting of array with integer indices * Fix basic __setitem__ for all test cases * WIP: fixing advanced indexing * REMOVE: printing in tests * Re-implement advanced indexing with None and Ellipsis * Fix lint errors * WIP: fix basic indexing * WIP: fix basic indexing * TEMP: print statements in tests * Fix op.slice with step<0 and end==-1 * Implement copy-free general contiguous indexing * Improve doctest of __getitem__ * Fix missing staticmethod * Remove superfluous _at and _slice * Fix lint errors * WIP: basic indexing * Remove print statements from tests * Fix call into op.slice in basic indexing, add doctest * Print failing index in setitem tests * Simplify implementation of advanced index broadcasting * Better printing for failing setitem tests * Remove list indexing restriction, fix value shape broadcasting * Fix bad string formatting * Fix bug in test_uniform * Test mutability of sliced array if contiguous * Fix whitespace error in matrix_op-inl.h * "Fix" pylint complaints * Temporarily disable failing unittests * Silence another pylint complaint * Fix size-0 array creation * Make scalar tensor assignment test check for IndexError * Re-activate operator tests with 0-size arrays * Use np.compat in tests with zeros in shape or empty shape * Change comment in autograd indexing test * Add more None-containing index tuples to indexing test * Disable None in advanced indexing test since it has not been supported * Fix sanity * Fix ci * Fix unit test failure * Fix __getitem__
…13143) * Support NDArray indexing with None and Ellipsis * Update NDArray.__setitem__ docs with None and Ellipsis * Fix boolean flag in NDArray.__getitem__, add doctests * Add setitem test for None and Ellipsis * Fix wrong slice used, add cases to test_indexing * Revamp NDArray.__getitem__ and __setitem__ * Fix typo in error message of SetSliceOpOutputDimSize * Fix setting of array with integer indices * Fix basic __setitem__ for all test cases * WIP: fixing advanced indexing * REMOVE: printing in tests * Re-implement advanced indexing with None and Ellipsis * Fix lint errors * WIP: fix basic indexing * WIP: fix basic indexing * TEMP: print statements in tests * Fix op.slice with step<0 and end==-1 * Implement copy-free general contiguous indexing * Improve doctest of __getitem__ * Fix missing staticmethod * Remove superfluous _at and _slice * Fix lint errors * WIP: basic indexing * Remove print statements from tests * Fix call into op.slice in basic indexing, add doctest * Print failing index in setitem tests * Simplify implementation of advanced index broadcasting * Better printing for failing setitem tests * Remove list indexing restriction, fix value shape broadcasting * Fix bad string formatting * Fix bug in test_uniform * Test mutability of sliced array if contiguous * Fix whitespace error in matrix_op-inl.h * "Fix" pylint complaints * Temporarily disable failing unittests * Silence another pylint complaint * Fix size-0 array creation * Make scalar tensor assignment test check for IndexError * Re-activate operator tests with 0-size arrays * Use np.compat in tests with zeros in shape or empty shape * Change comment in autograd indexing test * Add more None-containing index tuples to indexing test * Disable None in advanced indexing test since it has not been supported * Fix sanity * Fix ci * Fix unit test failure * Fix __getitem__
Description
This PR adds the capability of indexing
NDArray
withNone
to create new axes, and with...
(Ellipsis
) as a shorthand for an adequate number of:
(slice(None)
).Closes #13134
Closes #13166
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
NDArray.__getitem__
andNDArray.__setitem__
now first expandNone
andEllipsis
entries in indices.__setitem__
ignoresNone
entries, which corresponds to the behavior of NumPy.__getitem__
now always returns a view.Comments
I think that the implementation does not live up to the promise of the documentation:
In particular, the "detection" of contiguousness is not good.
I'll make that an extra issue, but it's related to this PR, so I mention it here.
Remaining TODOs
File "/work/mxnet/tests/python/unittest/test_dgl_graph.py", line 64, in check_compact
(fromtest_uniform_sample
)File "/work/mxnet/tests/python/unittest/test_operator.py", line 4195, in test_empty_tensor
(fromtest_tile
)File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 4290, in test_empty_indices
(fromtest_one_hot
)