-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SelectByIndex and improve minor of SelectByMask #5150
Add SelectByIndex and improve minor of SelectByMask #5150
Conversation
Thanks for submitting this pull request! The maintainers of this repository would appreciate if you could update the CHANGELOG.md based on your changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good in general with some comments. After the boolean type change, the tensor-based can have some performance gains.
Reviewed 3 of 4 files at r1, 1 of 1 files at r3, all commit messages.
Reviewable status: 4 of 5 files reviewed, 6 unresolved discussions (waiting on @yuecideng and @yxlao)
cpp/open3d/t/geometry/PointCloud.h
line 327 at r3 (raw file):
/// output point cloud. ///
Remove blank line.
cpp/open3d/t/geometry/PointCloud.h
line 328 at r3 (raw file):
/// /// \param indices Int32 indexing tensor of shape {n,} containing
Int64? Also mention how we handle duplicated indices.
cpp/open3d/t/geometry/PointCloud.h
line 332 at r3 (raw file):
/// \param invert Set to `True` to invert the selection of indices. PointCloud SelectByIndex(const core::Tensor &indices, bool invert = false) const;
It might not always be desirable to remove duplicated indices automatically, e.g., people may expect the output point cloud length to be the same as the input point cloud length.
Shall we add an additional parameter called remove_duplicates = false
, and by default, we do not remove the duplicates? This can also improve the default performance.
cpp/open3d/t/geometry/PointCloud.cpp
line 256 at r3 (raw file):
bool invert /* = false */) const { const int64_t length = GetPointPositions().GetLength(); core::AssertTensorDtype(indices, core::Dtype::Int64);
Use core::Int64
instead of core::Dtype::xxx
. Same for the rest.
cpp/open3d/t/geometry/PointCloud.cpp
line 265 at r3 (raw file):
core::Tensor::Zeros({length}, core::Dtype::Int16, GetDevice()); mask.SetItem(core::TensorKey::IndexTensor(indices), core::Tensor(std::vector<int16_t>{1}, {}, core::Dtype::Int16,
We need some code change to allow Bool
dtype.
diff --git a/cpp/open3d/core/kernel/IndexGetSetCPU.cpp b/cpp/open3d/core/kernel/IndexGetSetCPU.cpp
index 329f78091..ca35e3aff 100644
--- a/cpp/open3d/core/kernel/IndexGetSetCPU.cpp
+++ b/cpp/open3d/core/kernel/IndexGetSetCPU.cpp
@@ -71,7 +71,7 @@ void IndexGetCPU(const Tensor& src,
CPUCopyObjectElementKernel(src, dst, object_byte_size);
});
} else {
- DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+ DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
LaunchAdvancedIndexerKernel(ai, CPUCopyElementKernel<scalar_t>);
});
}
@@ -91,7 +91,7 @@ void IndexSetCPU(const Tensor& src,
CPUCopyObjectElementKernel(src, dst, object_byte_size);
});
} else {
- DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+ DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
LaunchAdvancedIndexerKernel(ai, CPUCopyElementKernel<scalar_t>);
});
}
diff --git a/cpp/open3d/core/kernel/IndexGetSetCUDA.cu b/cpp/open3d/core/kernel/IndexGetSetCUDA.cu
index ef0de0c1a..547e943ca 100644
--- a/cpp/open3d/core/kernel/IndexGetSetCUDA.cu
+++ b/cpp/open3d/core/kernel/IndexGetSetCUDA.cu
@@ -80,7 +80,7 @@ void IndexGetCUDA(const Tensor& src,
CUDACopyObjectElementKernel(src, dst, object_byte_size);
});
} else {
- DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+ DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
LaunchAdvancedIndexerKernel(
src.GetDevice(), ai,
// Need to wrap as extended CUDA lambda function
@@ -108,7 +108,7 @@ void IndexSetCUDA(const Tensor& src,
CUDACopyObjectElementKernel(src, dst, object_byte_size);
});
} else {
- DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+ DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
LaunchAdvancedIndexerKernel(
src.GetDevice(), ai,
// Need to wrap as extended CUDA lambda function
diff --git a/cpp/open3d/t/geometry/PointCloud.cpp b/cpp/open3d/t/geometry/PointCloud.cpp
index 96547c01a..04fa4f158 100644
--- a/cpp/open3d/t/geometry/PointCloud.cpp
+++ b/cpp/open3d/t/geometry/PointCloud.cpp
@@ -259,13 +259,11 @@ PointCloud PointCloud::SelectByIndex(const core::Tensor &indices,
// The indices may have duplicate index value and will result in identity
// point cloud attributes. We convert indices Tensor into mask Tensor to and
// call SelectByMask to avoid this situation.
- core::Tensor mask =
- core::Tensor::Zeros({length}, core::Dtype::Int16, GetDevice());
+ core::Tensor mask = core::Tensor::Zeros({length}, core::Bool, GetDevice());
mask.SetItem(core::TensorKey::IndexTensor(indices),
- core::Tensor(std::vector<int16_t>{1}, {}, core::Dtype::Int16,
- GetDevice()));
+ core::Tensor::Init<bool>(true, GetDevice()));
- PointCloud pcd = SelectByMask(mask.To(core::Dtype::Bool), invert);
+ PointCloud pcd = SelectByMask(mask, invert);
utility::LogDebug("Pointcloud down sampled from {} points to {} points.",
length, pcd.GetPointPositions().GetLength());
After this, we avoid some copies and type conversions. Although, it is still 2x slower than the legacy on my machine.
cpp/tests/t/geometry/PointCloud.cpp
line 707 at r3 (raw file):
{0.2, 0.4, 0.2}}, device)); const core::Tensor indices = core::Tensor::Init<int64_t>({0, 3}, device);
Add another test case to test duplicated indices, and test the remove_duplicates=xxx
flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 5 files reviewed, 6 unresolved discussions (waiting on @yuecideng and @yxlao)
cpp/open3d/t/geometry/PointCloud.h
line 327 at r3 (raw file):
Previously, yxlao (Yixing Lao) wrote…
Remove blank line.
Done
cpp/open3d/t/geometry/PointCloud.h
line 332 at r3 (raw file):
Previously, yxlao (Yixing Lao) wrote…
It might not always be desirable to remove duplicated indices automatically, e.g., people may expect the output point cloud length to be the same as the input point cloud length.
Shall we add an additional parameter called
remove_duplicates = false
, and by default, we do not remove the duplicates? This can also improve the default performance.
Agree with that because sometimes we need duplicated values.
I have two considerations:
- Adding
remove_duplicates
will make it different with legacy version ofSelectByIndex
, which theremove_duplicates
is not suported. Should we make them compatible? - I think
remove_duplicates
andinvert
would not be used together. That is , if user want to keep duplicates, he will never turninvert
to true. So now that we haveSelectByMask
, which is also supprtinvert
, could we removeinvert
fromSelectByIndex
and obly keepremove_duplicates
as arg. We can useGetItem
of Tensor, which is faster thanIndexGet
, and useHashSet
for duplicated indices removal. But one thing should be consider, if points size is larger than 1M and the the number of duplicated value is very small (eg. only 1 duplicated index), the performance ofInset
is bad (I have ran the benchmark, it abaout 0.25 - 0.3s). Should we look at other hashing method?
cpp/open3d/t/geometry/PointCloud.cpp
line 256 at r3 (raw file):
Previously, yxlao (Yixing Lao) wrote…
Use
core::Int64
instead ofcore::Dtype::xxx
. Same for the rest.
Done
cpp/open3d/t/geometry/PointCloud.cpp
line 265 at r3 (raw file):
Previously, yxlao (Yixing Lao) wrote…
We need some code change to allow
Bool
dtype.diff --git a/cpp/open3d/core/kernel/IndexGetSetCPU.cpp b/cpp/open3d/core/kernel/IndexGetSetCPU.cpp index 329f78091..ca35e3aff 100644 --- a/cpp/open3d/core/kernel/IndexGetSetCPU.cpp +++ b/cpp/open3d/core/kernel/IndexGetSetCPU.cpp @@ -71,7 +71,7 @@ void IndexGetCPU(const Tensor& src, CPUCopyObjectElementKernel(src, dst, object_byte_size); }); } else { - DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() { + DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() { LaunchAdvancedIndexerKernel(ai, CPUCopyElementKernel<scalar_t>); }); } @@ -91,7 +91,7 @@ void IndexSetCPU(const Tensor& src, CPUCopyObjectElementKernel(src, dst, object_byte_size); }); } else { - DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() { + DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() { LaunchAdvancedIndexerKernel(ai, CPUCopyElementKernel<scalar_t>); }); } diff --git a/cpp/open3d/core/kernel/IndexGetSetCUDA.cu b/cpp/open3d/core/kernel/IndexGetSetCUDA.cu index ef0de0c1a..547e943ca 100644 --- a/cpp/open3d/core/kernel/IndexGetSetCUDA.cu +++ b/cpp/open3d/core/kernel/IndexGetSetCUDA.cu @@ -80,7 +80,7 @@ void IndexGetCUDA(const Tensor& src, CUDACopyObjectElementKernel(src, dst, object_byte_size); }); } else { - DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() { + DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() { LaunchAdvancedIndexerKernel( src.GetDevice(), ai, // Need to wrap as extended CUDA lambda function @@ -108,7 +108,7 @@ void IndexSetCUDA(const Tensor& src, CUDACopyObjectElementKernel(src, dst, object_byte_size); }); } else { - DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() { + DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() { LaunchAdvancedIndexerKernel( src.GetDevice(), ai, // Need to wrap as extended CUDA lambda function diff --git a/cpp/open3d/t/geometry/PointCloud.cpp b/cpp/open3d/t/geometry/PointCloud.cpp index 96547c01a..04fa4f158 100644 --- a/cpp/open3d/t/geometry/PointCloud.cpp +++ b/cpp/open3d/t/geometry/PointCloud.cpp @@ -259,13 +259,11 @@ PointCloud PointCloud::SelectByIndex(const core::Tensor &indices, // The indices may have duplicate index value and will result in identity // point cloud attributes. We convert indices Tensor into mask Tensor to and // call SelectByMask to avoid this situation. - core::Tensor mask = - core::Tensor::Zeros({length}, core::Dtype::Int16, GetDevice()); + core::Tensor mask = core::Tensor::Zeros({length}, core::Bool, GetDevice()); mask.SetItem(core::TensorKey::IndexTensor(indices), - core::Tensor(std::vector<int16_t>{1}, {}, core::Dtype::Int16, - GetDevice())); + core::Tensor::Init<bool>(true, GetDevice())); - PointCloud pcd = SelectByMask(mask.To(core::Dtype::Bool), invert); + PointCloud pcd = SelectByMask(mask, invert); utility::LogDebug("Pointcloud down sampled from {} points to {} points.", length, pcd.GetPointPositions().GetLength());After this, we avoid some copies and type conversions. Although, it is still 2x slower than the legacy on my machine.
Yes, I also see this code because originally I want to used SetItem
with Bool
Dtype, which is not supported yet.
cpp/tests/t/geometry/PointCloud.cpp
line 707 at r3 (raw file):
Previously, yxlao (Yixing Lao) wrote…
Add another test case to test duplicated indices, and test the
remove_duplicates=xxx
flag.
Will add after determine the modification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 5 files reviewed, 6 unresolved discussions (waiting on @yuecideng and @yxlao)
cpp/open3d/t/geometry/PointCloud.h
line 332 at r3 (raw file):
Previously, yuecideng (Yueci Deng) wrote…
Agree with that because sometimes we need duplicated values.
I have two considerations:
- Adding
remove_duplicates
will make it different with legacy version ofSelectByIndex
, which theremove_duplicates
is not suported. Should we make them compatible?- I think
remove_duplicates
andinvert
would not be used together. That is , if user want to keep duplicates, he will never turninvert
to true. So now that we haveSelectByMask
, which is also supprtinvert
, could we removeinvert
fromSelectByIndex
and obly keepremove_duplicates
as arg. We can useGetItem
of Tensor, which is faster thanIndexGet
, and useHashSet
for duplicated indices removal. But one thing should be consider, if points size is larger than 1M and the the number of duplicated value is very small (eg. only 1 duplicated index), the performance ofInset
is bad (I have ran the benchmark, it abaout 0.25 - 0.3s). Should we look at other hashing method?
- It is okay to be different than the legacy.
invert
should precedence overremove_duplicates
, sinceinvert
is a more important property thanremove_duplicates
.- When
invert=true
, the value ofremove_duplicates
is ignored. - When
invert=false
, we value ofremove_duplicates
is honored.
- When
In this way, we can keep both parameters in the SelectByIndex
function.
We can investigate the hashing problem later. For now, it is good to just use the approach that you are using.
cpp/open3d/t/geometry/PointCloud.cpp
line 265 at r3 (raw file):
Previously, yuecideng (Yueci Deng) wrote…
Yes, I also see this code because originally I want to used
SetItem
withBool
Dtype, which is not supported yet.
Could you apply the changes above to the PR?
@@ -241,8 +241,7 @@ PointCloud PointCloud::SelectPoints(const core::Tensor &boolean_mask, | |||
PointCloud pcd(GetDevice()); | |||
for (auto &kv : GetPointAttr()) { | |||
if (HasPointAttr(kv.first)) { | |||
pcd.SetPointAttr(kv.first, | |||
kv.second.IndexGet({indices_local}).Clone()); | |||
pcd.SetPointAttr(kv.first, kv.second.IndexGet({indices_local})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if not cloning
is safe here.
-
Legacy : Eigen always makes a copy.
-
This is a
const
function, returning anew
point cloud, so we might want to clone instead of sharing memory. -
Although sharing memory offers some flexibility and performance gain, it is also potentially dangerous for user applications, where one is not careful about the shared memory. So, we are sharing memory, we should always mention it clearly in the documentation/docstrings.
@yxlao any suggestions here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Tensor::IndexGet
will return a new tensor except when it is a 0-D tensor with nonzero indexer. And it is a const
function as well.
Tensor Tensor::IndexGet(const std::vector<Tensor>& index_tensors) const {
if (NumDims() == 0) {
if (index_tensors.size() != 1) {
utility::LogError(
"A 0-D tensor can only be indexed by a 0-D boolean tensor, "
"but got {} index tensors.",
index_tensors.size());
}
Tensor index_tensor = index_tensors[0];
core::AssertTensorShape(index_tensor, {});
core::AssertTensorDtype(index_tensor, core::Bool);
if (index_tensor.IsNonZero()) {
// E.g. np.array(5)[np.array(True)].
return *this;
} else {
// E.g. np.array(5)[np.array(False)].
// The output tensor becomes 1D of 0 element.
return Tensor(/*shape=*/{0}, GetDtype(), GetDevice());
}
}
AdvancedIndexPreprocessor aip(*this, index_tensors);
Tensor dst = Tensor(aip.GetOutputShape(), dtype_, GetDevice());
kernel::IndexGet(aip.GetTensor(), dst, aip.GetIndexTensors(),
aip.GetIndexedShape(), aip.GetIndexedStrides());
return dst;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit to make Tensor::IndexGet
always return a new tensor, as it should have consistent behavior for all cases.
#ifdef BUILD_CUDA_MODULE | ||
BENCHMARK_CAPTURE(Transform, CUDA, core::Device("CUDA:0")) | ||
->Unit(benchmark::kMillisecond); | ||
BENCHMARK_CAPTURE(SelectByIndex, CUDA, false core::Device("CUDA:0")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add ,
after false.
->Unit(benchmark::kMillisecond); | ||
BENCHMARK_CAPTURE(SelectByIndex, | ||
CUDA(remove duplicates), | ||
true core::Device("CUDA:0")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add ,
after true.
Merging, as all issues have been addressed. |
This change is