Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SelectByIndex and improve minor of SelectByMask #5150

Merged
merged 7 commits into from
Jun 10, 2022

Conversation

yuecideng
Copy link
Collaborator

@yuecideng yuecideng commented Jun 6, 2022

  1. Change SelectPoints -> SelectByMask in tensor based PointCloud.
  2. Remove tensor clone inside SelectByMask.
  3. Add SelectByIndex (both unit test and pybind) in tensor based PointCloud.

This change is Reviewable

@update-docs
Copy link

update-docs bot commented Jun 6, 2022

Thanks for submitting this pull request! The maintainers of this repository would appreciate if you could update the CHANGELOG.md based on your changes.

@yuecideng
Copy link
Collaborator Author

I also add legacy Transform and SelectByIndex in benchmark.
The Transform works fine for tensor based. (The legacy version can also be parallel using openmp)
image

The SelectByIndex is slow compared with legacy version (tensor based version is wrapped from SelectByMask, original SelectPoints)
image

Copy link
Collaborator

@yxlao yxlao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general with some comments. After the boolean type change, the tensor-based can have some performance gains.

Reviewed 3 of 4 files at r1, 1 of 1 files at r3, all commit messages.
Reviewable status: 4 of 5 files reviewed, 6 unresolved discussions (waiting on @yuecideng and @yxlao)


cpp/open3d/t/geometry/PointCloud.h line 327 at r3 (raw file):

    /// output point cloud.
    ///

Remove blank line.


cpp/open3d/t/geometry/PointCloud.h line 328 at r3 (raw file):

    ///

    /// \param indices Int32 indexing tensor of shape {n,} containing

Int64? Also mention how we handle duplicated indices.


cpp/open3d/t/geometry/PointCloud.h line 332 at r3 (raw file):

    /// \param invert Set to `True` to invert the selection of indices.
    PointCloud SelectByIndex(const core::Tensor &indices,
                             bool invert = false) const;

It might not always be desirable to remove duplicated indices automatically, e.g., people may expect the output point cloud length to be the same as the input point cloud length.

Shall we add an additional parameter called remove_duplicates = false, and by default, we do not remove the duplicates? This can also improve the default performance.


cpp/open3d/t/geometry/PointCloud.cpp line 256 at r3 (raw file):

                                     bool invert /* = false */) const {
    const int64_t length = GetPointPositions().GetLength();
    core::AssertTensorDtype(indices, core::Dtype::Int64);

Use core::Int64 instead of core::Dtype::xxx. Same for the rest.


cpp/open3d/t/geometry/PointCloud.cpp line 265 at r3 (raw file):

            core::Tensor::Zeros({length}, core::Dtype::Int16, GetDevice());
    mask.SetItem(core::TensorKey::IndexTensor(indices),
                 core::Tensor(std::vector<int16_t>{1}, {}, core::Dtype::Int16,

We need some code change to allow Bool dtype.

diff --git a/cpp/open3d/core/kernel/IndexGetSetCPU.cpp b/cpp/open3d/core/kernel/IndexGetSetCPU.cpp
index 329f78091..ca35e3aff 100644
--- a/cpp/open3d/core/kernel/IndexGetSetCPU.cpp
+++ b/cpp/open3d/core/kernel/IndexGetSetCPU.cpp
@@ -71,7 +71,7 @@ void IndexGetCPU(const Tensor& src,
             CPUCopyObjectElementKernel(src, dst, object_byte_size);
         });
     } else {
-        DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+        DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
             LaunchAdvancedIndexerKernel(ai, CPUCopyElementKernel<scalar_t>);
         });
     }
@@ -91,7 +91,7 @@ void IndexSetCPU(const Tensor& src,
             CPUCopyObjectElementKernel(src, dst, object_byte_size);
         });
     } else {
-        DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+        DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
             LaunchAdvancedIndexerKernel(ai, CPUCopyElementKernel<scalar_t>);
         });
     }
diff --git a/cpp/open3d/core/kernel/IndexGetSetCUDA.cu b/cpp/open3d/core/kernel/IndexGetSetCUDA.cu
index ef0de0c1a..547e943ca 100644
--- a/cpp/open3d/core/kernel/IndexGetSetCUDA.cu
+++ b/cpp/open3d/core/kernel/IndexGetSetCUDA.cu
@@ -80,7 +80,7 @@ void IndexGetCUDA(const Tensor& src,
                     CUDACopyObjectElementKernel(src, dst, object_byte_size);
                 });
     } else {
-        DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+        DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
             LaunchAdvancedIndexerKernel(
                     src.GetDevice(), ai,
                     // Need to wrap as extended CUDA lambda function
@@ -108,7 +108,7 @@ void IndexSetCUDA(const Tensor& src,
                     CUDACopyObjectElementKernel(src, dst, object_byte_size);
                 });
     } else {
-        DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+        DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
             LaunchAdvancedIndexerKernel(
                     src.GetDevice(), ai,
                     // Need to wrap as extended CUDA lambda function
diff --git a/cpp/open3d/t/geometry/PointCloud.cpp b/cpp/open3d/t/geometry/PointCloud.cpp
index 96547c01a..04fa4f158 100644
--- a/cpp/open3d/t/geometry/PointCloud.cpp
+++ b/cpp/open3d/t/geometry/PointCloud.cpp
@@ -259,13 +259,11 @@ PointCloud PointCloud::SelectByIndex(const core::Tensor &indices,
     // The indices may have duplicate index value and will result in identity
     // point cloud attributes. We convert indices Tensor into mask Tensor to and
     // call SelectByMask to avoid this situation.
-    core::Tensor mask =
-            core::Tensor::Zeros({length}, core::Dtype::Int16, GetDevice());
+    core::Tensor mask = core::Tensor::Zeros({length}, core::Bool, GetDevice());
     mask.SetItem(core::TensorKey::IndexTensor(indices),
-                 core::Tensor(std::vector<int16_t>{1}, {}, core::Dtype::Int16,
-                              GetDevice()));
+                 core::Tensor::Init<bool>(true, GetDevice()));
 
-    PointCloud pcd = SelectByMask(mask.To(core::Dtype::Bool), invert);
+    PointCloud pcd = SelectByMask(mask, invert);
 
     utility::LogDebug("Pointcloud down sampled from {} points to {} points.",
                       length, pcd.GetPointPositions().GetLength());

After this, we avoid some copies and type conversions. Although, it is still 2x slower than the legacy on my machine.


cpp/tests/t/geometry/PointCloud.cpp line 707 at r3 (raw file):

                                       {0.2, 0.4, 0.2}},
                                      device));
    const core::Tensor indices = core::Tensor::Init<int64_t>({0, 3}, device);

Add another test case to test duplicated indices, and test the remove_duplicates=xxx flag.

Copy link
Collaborator Author

@yuecideng yuecideng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 5 files reviewed, 6 unresolved discussions (waiting on @yuecideng and @yxlao)


cpp/open3d/t/geometry/PointCloud.h line 327 at r3 (raw file):

Previously, yxlao (Yixing Lao) wrote…

Remove blank line.

Done


cpp/open3d/t/geometry/PointCloud.h line 332 at r3 (raw file):

Previously, yxlao (Yixing Lao) wrote…

It might not always be desirable to remove duplicated indices automatically, e.g., people may expect the output point cloud length to be the same as the input point cloud length.

Shall we add an additional parameter called remove_duplicates = false, and by default, we do not remove the duplicates? This can also improve the default performance.

Agree with that because sometimes we need duplicated values.

I have two considerations:

  1. Adding remove_duplicates will make it different with legacy version of SelectByIndex, which the remove_duplicates is not suported. Should we make them compatible?
  2. I think remove_duplicates and invert would not be used together. That is , if user want to keep duplicates, he will never turn invert to true. So now that we have SelectByMask, which is also supprt invert, could we remove invert from SelectByIndex and obly keep remove_duplicates as arg. We can use GetItem of Tensor, which is faster than IndexGet, and use HashSet for duplicated indices removal. But one thing should be consider, if points size is larger than 1M and the the number of duplicated value is very small (eg. only 1 duplicated index), the performance of Inset is bad (I have ran the benchmark, it abaout 0.25 - 0.3s). Should we look at other hashing method?

cpp/open3d/t/geometry/PointCloud.cpp line 256 at r3 (raw file):

Previously, yxlao (Yixing Lao) wrote…

Use core::Int64 instead of core::Dtype::xxx. Same for the rest.

Done


cpp/open3d/t/geometry/PointCloud.cpp line 265 at r3 (raw file):

Previously, yxlao (Yixing Lao) wrote…

We need some code change to allow Bool dtype.

diff --git a/cpp/open3d/core/kernel/IndexGetSetCPU.cpp b/cpp/open3d/core/kernel/IndexGetSetCPU.cpp
index 329f78091..ca35e3aff 100644
--- a/cpp/open3d/core/kernel/IndexGetSetCPU.cpp
+++ b/cpp/open3d/core/kernel/IndexGetSetCPU.cpp
@@ -71,7 +71,7 @@ void IndexGetCPU(const Tensor& src,
             CPUCopyObjectElementKernel(src, dst, object_byte_size);
         });
     } else {
-        DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+        DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
             LaunchAdvancedIndexerKernel(ai, CPUCopyElementKernel<scalar_t>);
         });
     }
@@ -91,7 +91,7 @@ void IndexSetCPU(const Tensor& src,
             CPUCopyObjectElementKernel(src, dst, object_byte_size);
         });
     } else {
-        DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+        DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
             LaunchAdvancedIndexerKernel(ai, CPUCopyElementKernel<scalar_t>);
         });
     }
diff --git a/cpp/open3d/core/kernel/IndexGetSetCUDA.cu b/cpp/open3d/core/kernel/IndexGetSetCUDA.cu
index ef0de0c1a..547e943ca 100644
--- a/cpp/open3d/core/kernel/IndexGetSetCUDA.cu
+++ b/cpp/open3d/core/kernel/IndexGetSetCUDA.cu
@@ -80,7 +80,7 @@ void IndexGetCUDA(const Tensor& src,
                     CUDACopyObjectElementKernel(src, dst, object_byte_size);
                 });
     } else {
-        DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+        DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
             LaunchAdvancedIndexerKernel(
                     src.GetDevice(), ai,
                     // Need to wrap as extended CUDA lambda function
@@ -108,7 +108,7 @@ void IndexSetCUDA(const Tensor& src,
                     CUDACopyObjectElementKernel(src, dst, object_byte_size);
                 });
     } else {
-        DISPATCH_DTYPE_TO_TEMPLATE(dtype, [&]() {
+        DISPATCH_DTYPE_TO_TEMPLATE_WITH_BOOL(dtype, [&]() {
             LaunchAdvancedIndexerKernel(
                     src.GetDevice(), ai,
                     // Need to wrap as extended CUDA lambda function
diff --git a/cpp/open3d/t/geometry/PointCloud.cpp b/cpp/open3d/t/geometry/PointCloud.cpp
index 96547c01a..04fa4f158 100644
--- a/cpp/open3d/t/geometry/PointCloud.cpp
+++ b/cpp/open3d/t/geometry/PointCloud.cpp
@@ -259,13 +259,11 @@ PointCloud PointCloud::SelectByIndex(const core::Tensor &indices,
     // The indices may have duplicate index value and will result in identity
     // point cloud attributes. We convert indices Tensor into mask Tensor to and
     // call SelectByMask to avoid this situation.
-    core::Tensor mask =
-            core::Tensor::Zeros({length}, core::Dtype::Int16, GetDevice());
+    core::Tensor mask = core::Tensor::Zeros({length}, core::Bool, GetDevice());
     mask.SetItem(core::TensorKey::IndexTensor(indices),
-                 core::Tensor(std::vector<int16_t>{1}, {}, core::Dtype::Int16,
-                              GetDevice()));
+                 core::Tensor::Init<bool>(true, GetDevice()));
 
-    PointCloud pcd = SelectByMask(mask.To(core::Dtype::Bool), invert);
+    PointCloud pcd = SelectByMask(mask, invert);
 
     utility::LogDebug("Pointcloud down sampled from {} points to {} points.",
                       length, pcd.GetPointPositions().GetLength());

After this, we avoid some copies and type conversions. Although, it is still 2x slower than the legacy on my machine.

Yes, I also see this code because originally I want to used SetItem with Bool Dtype, which is not supported yet.


cpp/tests/t/geometry/PointCloud.cpp line 707 at r3 (raw file):

Previously, yxlao (Yixing Lao) wrote…

Add another test case to test duplicated indices, and test the remove_duplicates=xxx flag.

Will add after determine the modification.

Copy link
Collaborator

@yxlao yxlao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 5 files reviewed, 6 unresolved discussions (waiting on @yuecideng and @yxlao)


cpp/open3d/t/geometry/PointCloud.h line 332 at r3 (raw file):

Previously, yuecideng (Yueci Deng) wrote…

Agree with that because sometimes we need duplicated values.

I have two considerations:

  1. Adding remove_duplicates will make it different with legacy version of SelectByIndex, which the remove_duplicates is not suported. Should we make them compatible?
  2. I think remove_duplicates and invert would not be used together. That is , if user want to keep duplicates, he will never turn invert to true. So now that we have SelectByMask, which is also supprt invert, could we remove invert from SelectByIndex and obly keep remove_duplicates as arg. We can use GetItem of Tensor, which is faster than IndexGet, and use HashSet for duplicated indices removal. But one thing should be consider, if points size is larger than 1M and the the number of duplicated value is very small (eg. only 1 duplicated index), the performance of Inset is bad (I have ran the benchmark, it abaout 0.25 - 0.3s). Should we look at other hashing method?
  1. It is okay to be different than the legacy.
  2. invert should precedence over remove_duplicates, since invert is a more important property than remove_duplicates.
    • When invert=true, the value of remove_duplicates is ignored.
    • When invert=false, we value of remove_duplicates is honored.

In this way, we can keep both parameters in the SelectByIndex function.

We can investigate the hashing problem later. For now, it is good to just use the approach that you are using.


cpp/open3d/t/geometry/PointCloud.cpp line 265 at r3 (raw file):

Previously, yuecideng (Yueci Deng) wrote…

Yes, I also see this code because originally I want to used SetItem with Bool Dtype, which is not supported yet.

Could you apply the changes above to the PR?

@@ -241,8 +241,7 @@ PointCloud PointCloud::SelectPoints(const core::Tensor &boolean_mask,
PointCloud pcd(GetDevice());
for (auto &kv : GetPointAttr()) {
if (HasPointAttr(kv.first)) {
pcd.SetPointAttr(kv.first,
kv.second.IndexGet({indices_local}).Clone());
pcd.SetPointAttr(kv.first, kv.second.IndexGet({indices_local}));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if not cloning is safe here.

  • Legacy : Eigen always makes a copy.

  • This is a const function, returning a new point cloud, so we might want to clone instead of sharing memory.

  • Although sharing memory offers some flexibility and performance gain, it is also potentially dangerous for user applications, where one is not careful about the shared memory. So, we are sharing memory, we should always mention it clearly in the documentation/docstrings.

@yxlao any suggestions here ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Tensor::IndexGet will return a new tensor except when it is a 0-D tensor with nonzero indexer. And it is a const function as well.

Tensor Tensor::IndexGet(const std::vector<Tensor>& index_tensors) const {
    if (NumDims() == 0) {
        if (index_tensors.size() != 1) {
            utility::LogError(
                    "A 0-D tensor can only be indexed by a 0-D boolean tensor, "
                    "but got {} index tensors.",
                    index_tensors.size());
        }
        Tensor index_tensor = index_tensors[0];
        core::AssertTensorShape(index_tensor, {});
        core::AssertTensorDtype(index_tensor, core::Bool);

        if (index_tensor.IsNonZero()) {
            // E.g. np.array(5)[np.array(True)].
            return *this;
        } else {
            // E.g. np.array(5)[np.array(False)].
            // The output tensor becomes 1D of 0 element.
            return Tensor(/*shape=*/{0}, GetDtype(), GetDevice());
        }
    }

    AdvancedIndexPreprocessor aip(*this, index_tensors);
    Tensor dst = Tensor(aip.GetOutputShape(), dtype_, GetDevice());

    kernel::IndexGet(aip.GetTensor(), dst, aip.GetIndexTensors(),
                     aip.GetIndexedShape(), aip.GetIndexedStrides());

    return dst;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit to make Tensor::IndexGet always return a new tensor, as it should have consistent behavior for all cases.

#ifdef BUILD_CUDA_MODULE
BENCHMARK_CAPTURE(Transform, CUDA, core::Device("CUDA:0"))
->Unit(benchmark::kMillisecond);
BENCHMARK_CAPTURE(SelectByIndex, CUDA, false core::Device("CUDA:0"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add , after false.

->Unit(benchmark::kMillisecond);
BENCHMARK_CAPTURE(SelectByIndex,
CUDA(remove duplicates),
true core::Device("CUDA:0"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add , after true.

@yxlao
Copy link
Collaborator

yxlao commented Jun 10, 2022

Merging, as all issues have been addressed.

@yxlao yxlao merged commit 46d58ff into isl-org:master Jun 10, 2022
@yuecideng yuecideng deleted the yuecideng/TPointCloudIndexing branch June 11, 2022 01:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants