【Hackathon 7th No.25】Add is_coalesced to Paddle -part #68334

NKNaN · 2024-09-20T02:11:44Z

PR Category

User Experience

PR Types

New features

Description

Adds paddle.Tensor.is_coalesced
rfc: PaddlePaddle/community#961

lxd-cumt

LGTM

NKNaN · 2024-09-26T06:54:26Z

This unit test has some problems; I'll revise it.

jeff41404 · 2024-09-26T07:30:08Z

test/legacy_test/test_sparse_is_coalesced.py

+def is_coalesced_naive_static(indices):
+    indices = list(zip(*indices))
+    duplicated_len = len(indices)
+    remove_duplicated_len = len(set(indices))
+    return duplicated_len == remove_duplicated_len


Does this function seem to be unused?

Yes. It has been removed.

NKNaN · 2024-09-27T02:50:09Z

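Because the coalesced_ attribute of the SparseCooTensor class defaults to false on instantiation, a SparseCooTensor that is directly instantiated in coalesced form still has coalesced_ set to false. is_coalesced() therefore cannot simply return the coalesced_ attribute through the underlying SparseCooTensor::coalesced() method. I have reworked the implementation.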
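
The mismatch described above can be illustrated with a small, framework-free sketch. `ToyCoo`, `coalesced_flag`, and `has_unique_coords` are hypothetical names made up for illustration, not Paddle's classes or API; the sketch only assumes a flag that starts as False and is set by `coalesce()`.

```python
from collections import defaultdict


class ToyCoo:
    """Hypothetical stand-in for a COO tensor; not Paddle's actual class."""

    def __init__(self, indices, values):
        # indices: one list per sparse dimension, each of length nnz
        self.indices = [list(row) for row in indices]
        self.values = list(values)
        # Assumption mirrored from the comment: the flag is never
        # inspected or updated at construction time.
        self.coalesced_flag = False

    def coalesce(self):
        # Sum values that share a coordinate, then sort the coordinates.
        merged = defaultdict(float)
        for coord, value in zip(zip(*self.indices), self.values):
            merged[coord] += value
        coords = sorted(merged)
        ndim = len(self.indices)
        self.indices = [[c[d] for c in coords] for d in range(ndim)]
        self.values = [merged[c] for c in coords]
        self.coalesced_flag = True
        return self


def has_unique_coords(tensor):
    """Index-based check: True when no coordinate appears twice."""
    coords = list(zip(*tensor.indices))
    return len(coords) == len(set(coords))


already_unique = ToyCoo([[0, 1], [1, 2]], [1.0, 2.0])
with_duplicates = ToyCoo([[0, 0], [1, 1]], [1.0, 2.0])
```

In this toy model `already_unique.coalesced_flag` is False even though its coordinates contain no duplicates, which is the situation the comment describes; only after `coalesce()` do the flag and the index-based check agree.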

jeff41404 · 2024-09-27T04:15:56Z

According to the API specifications, every API needs to support both dynamic and static graphs, so is_coalesced also needs to support the static graph (PIR Value). You can refer to pir.cc.

NKNaN · 2024-09-27T08:19:30Z

According to the API specifications, every API needs to support both dynamic and static graphs, so is_coalesced also needs to support the static graph (PIR Value). You can refer to pir.cc.

Could we design this API to return a 0-D bool Tensor instead?
Returning a plain bool directly in static mode seems difficult, especially if is_coalesced is implemented with existing Paddle APIs or by calling kernel functions.

paddle-ci-bot · 2024-10-05T03:15:10Z

Sorry to inform you that eaacb9e's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

zhwesky2010 · 2024-10-09T03:54:43Z

This is similar to an attribute-style API, so returning a bool value is enough, like paddle.Tensor.is_sparse, paddle.Tensor.is_sparse_coo, and paddle.Tensor.is_sparse_csr. You need to implement it in both eager_method and pir.

paddle/fluid/pybind/eager_method.cc

paddle/fluid/pybind/pir.cc

As for the implementation, why not do it in C++: maintain a coalesced_ bool attribute on the Tensor and simply return it? That is how PyTorch implements it.

NKNaN · 2024-10-09T05:14:38Z

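This is similar to an attribute-style API, so returning a bool value is enough, like paddle.Tensor.is_sparse, paddle.Tensor.is_sparse_coo, and paddle.Tensor.is_sparse_csr. You need to implement it in both eager_method and pir.

OK.

As for the implementation, why not do it in C++: maintain a coalesced_ bool attribute on the Tensor and simply return it? That is how PyTorch implements it.

The underlying SparseCooTensor already has a coalesced_ bool attribute, which defaults to false. The problem is that it is not updated when a tensor is created; it only becomes true after coalesce() is called. If we start determining at creation time whether coalesced_ needs to be updated, won't that hurt the efficiency of creating tensors?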

NKNaN · 2024-10-15T08:37:51Z

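Also, when maintaining coalesced_, how do we make sure coalesced_ is set correctly in the static graph as well?
Deciding whether a tensor is coalesced requires its indices. Because indices is a Tensor that may live on different devices, the attribute can only be maintained inside the kernel that creates the SparseCooTensor.

Suppose the static graph is_coalesced() interface looks like this: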

.def("is_coalesced",
     [](Value self) {
       auto sparse_coo_tensor_type = self.type().dyn_cast<SparseCooTensorType>();
       if (sparse_coo_tensor_type) {
         return sparse_coo_tensor_type.coalesced();
       }
       return false;
     })

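and phi::sparse::SparseCooTensorKernel is then modified as follows, calling SetCoalesced after determining whether the tensor is coalesced. However, in the static graph, after creating a tensor with paddle.sparse.sparse_coo_tensor, the coalesced_ attribute of SparseCooTensorType always keeps its default value, whatever the case.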

#include <set>
#include <vector>

template <typename T, typename Context>
void CheckCoalesced(const DenseTensor& indices,
                    DDim dims,
                    int64_t sparse_dim,
                    int64_t nnz,
                    bool* coalesced) {
  // Row-major strides over the sparse dimensions, used to flatten each
  // coordinate into a single offset.
  std::vector<T> sparse_offsets(sparse_dim), x_indexs(nnz);
  const T* ind_data = indices.data<T>();
  T* offset_data = sparse_offsets.data();
  T* x_indexs_data = x_indexs.data();
  T offset = 1;
  // Use a signed loop index: if T is unsigned, `i >= 0` would never
  // become false.
  for (int64_t i = sparse_dim - 1; i >= 0; i--) {
    offset_data[i] = offset;
    offset *= dims[i];
  }
  for (int64_t i = 0; i < nnz; ++i) {
    x_indexs_data[i] = 0;
    for (int64_t j = 0; j < sparse_dim; j++) {
      x_indexs_data[i] += ind_data[j * nnz + i] * offset_data[j];
    }
  }
  // Any repeated flattened offset means the tensor is not coalesced.
  std::set<T> seen;
  for (int64_t i = 0; i < nnz; i++) {
    if (!seen.insert(x_indexs_data[i]).second) {
      *coalesced = false;
      return;
    }
  }
  *coalesced = true;
}

template <typename Context>
void CheckAndSetCoalesced(const Context& dev_ctx,
                          SparseCooTensor* out) {
  bool coalesced = false;
  DenseTensor indices = out->indices();
  auto dims = out->dims();
  int64_t sparse_dim = static_cast<int64_t>(indices.dims()[0]);
  int64_t nnz = out->nnz();
  PD_VISIT_BASE_INTEGRAL_TYPES(
      indices.dtype(), "CheckCoalesced", ([&] {
        CheckCoalesced<data_t, Context>(indices, dims, sparse_dim, nnz, &coalesced);
      }));
  out->SetCoalesced(coalesced);
}

template <typename T, typename Context>
void SparseCooTensorKernel(const Context& dev_ctx,
                           const DenseTensor& values,
                           const DenseTensor& indices,
                           const std::vector<int64_t>& shape,
                           SparseCooTensor* out) {
  *out = SparseCooTensor(indices, values, common::make_ddim(shape));
  CheckAndSetCoalesced<Context>(dev_ctx, out);
}
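
For readers who want to try the idea outside of the kernel, the duplicate check above can be sketched in plain Python. This is only an illustration under the same assumptions as the C++ snippet (indices laid out as one row per sparse dimension, row-major flattening); the function names `linearize` and `has_no_duplicate_coords` are made up here, and like the snippet it tests uniqueness only, not ordering.

```python
def linearize(indices, shape):
    """Map each coordinate (one column of `indices`) to a row-major offset."""
    sparse_dim = len(indices)
    nnz = len(indices[0]) if sparse_dim else 0
    # strides[d] is the product of the sizes of all later sparse dimensions.
    strides = [1] * sparse_dim
    for d in range(sparse_dim - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return [
        sum(indices[d][i] * strides[d] for d in range(sparse_dim))
        for i in range(nnz)
    ]


def has_no_duplicate_coords(indices, shape):
    """Return False as soon as a flattened offset repeats."""
    seen = set()
    for flat in linearize(indices, shape):
        if flat in seen:
            return False
        seen.add(flat)
    return True


unique_indices = [[0, 0, 1], [1, 2, 0]]   # coordinates (0,1), (0,2), (1,0)
repeated_indices = [[0, 0], [1, 1]]       # coordinate (0,1) appears twice
```

The check visits every stored coordinate once, which is the per-creation cost raised earlier in the thread.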

Should the static graph is_coalesced() interface instead obtain an IrSparseCooTensor or some other corresponding data type? If so, how should the conversion be done?

zhwesky2010 · 2024-10-16T04:39:52Z

Isn't this overthinking it? Maintaining a single attribute should be enough:

NKNaN · 2024-10-16T06:06:45Z

Isn't this overthinking it? Maintaining a single attribute should be enough:

OK, understood.

zhwesky2010

LGTM

SigureMo · 2024-10-25T04:33:48Z

paddle/fluid/pybind/eager_method.cc

@@ -3503,6 +3559,10 @@ PyMethodDef variable_methods[] = {  // NOLINT
     (PyCFunction)(void (*)())tensor_method_to_sparse_csr,
     METH_VARARGS | METH_KEYWORDS,
     tensor_to_sparse_csr__doc__},
+    {"is_coalesced",


The type of the new Tensor API needs to be added to the python/paddle/tensor/tensor.prototype.pyi stub.

OK, added.

SigureMo · 2024-10-25T04:36:49Z

test/legacy_test/test_sparse_is_coalesced.py

+        self.expected_result = [False, False]
+
+    def test_is_coalesced(self):
+        if in_pir_mode():


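There is a risk that this is never exercised: in_pir_mode is use_pir_api and in_static_mode, so if the test is not running in static graph mode, the branch is silently skipped.

I therefore suggest changing it to use_pir_api, or simply removing the condition, since PIR mode is now the default.

Removed in_pir_mode.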

SigureMo · 2024-10-25T04:46:53Z

paddle/fluid/pybind/eager_method.cc

+        std::dynamic_pointer_cast<phi::SparseCooTensor>(self->tensor.impl());
+    return ToPyObject(sparse_coo_tensor->coalesced());
+  } else {
+    return ToPyObject(false);


Just a question: when the wrong type is passed in here, wouldn't raising an error be better than returning False? Was this considered in the original design?

This follows PyTorch's design.

Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> x = torch.as_tensor([[1., 2., 3.]])
>>> x.is_coalesced()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: is_coalesced expected sparse coordinate tensor layout but got Strided
>>>

But PyTorch appears to raise an error?

OK, I'll revise it.
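
The two behaviours being weighed here can be contrasted in a short sketch. The classes and functions below are hypothetical placeholders rather than Paddle or PyTorch code, and the exception type and message are assumptions chosen for illustration.

```python
class DenseLike:
    """Hypothetical placeholder for a non-sparse tensor."""


class SparseCooLike:
    """Hypothetical placeholder for a sparse COO tensor."""

    def __init__(self, coalesced):
        self.coalesced = coalesced


def is_coalesced_lenient(tensor):
    # Returns False for any input that is not sparse COO, which hides
    # the fact that the call did not make sense for that input.
    if isinstance(tensor, SparseCooLike):
        return tensor.coalesced
    return False


def is_coalesced_strict(tensor):
    # Rejects inputs that are not sparse COO instead of guessing an answer.
    if not isinstance(tensor, SparseCooLike):
        raise TypeError(
            "is_coalesced expects a sparse COO tensor, got "
            + type(tensor).__name__
        )
    return tensor.coalesced
```

With the lenient variant a caller cannot tell "dense tensor" apart from "sparse but not coalesced"; the strict variant makes the misuse visible at the call site.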

paddle/fluid/pybind/eager_method.cc

paddle/fluid/pybind/pir.cc

test/legacy_test/test_sparse_is_coalesced.py

Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>

SigureMo

LGTMeow

jeff41404

LGTM

luotao1 · 2024-10-28T08:37:58Z

Please submit the corresponding Chinese documentation.

* add is_coalesced
* update test
* fix test_tensor_attr_consistency
* delete pir mode test and fix docs code example
* add is_coalesced
* update test
* fix test_tensor_attr_consistency
* delete pir mode test and fix docs code example
* fix test
* update
* update
* delete unnecessary
* fix test
* update test
* update: raise error if not sparsecootensor
* Update paddle/fluid/pybind/eager_method.cc

Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>

* Apply suggestions from code review

Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>

---------

Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>

add is_coalesced

ebf8490

NKNaN changed the title ~~Hackathon 7th No.25】Add is_coalesced to Paddle -part~~ 【Hackathon 7th No.25】Add is_coalesced to Paddle -part Sep 20, 2024

update test

1d19f8c

luotao1 mentioned this pull request Sep 20, 2024

【Hackathon 7th】Open Source Contribution Individual Challenge #68244

Open

fix test_tensor_attr_consistency

0fe9c86

NKNaN requested review from SigureMo, Aurelius84 and gouzil as code owners September 20, 2024 12:01

paddle-bot bot added the contributor External developers label Sep 20, 2024

delete pir mode test and fix docs code example

a3d11ba

luotao1 added the PaddlePaddle Hackathon label Sep 23, 2024

luotao1 assigned luotao1 and lxd-cumt Sep 23, 2024

lxd-cumt approved these changes Sep 26, 2024

luotao1 assigned jeff41404 and sunzhongkai588 Sep 26, 2024

NKNaN added 5 commits September 26, 2024 14:55

add is_coalesced

0224f99

update test

33a0a82

fix test_tensor_attr_consistency

c7f3b56

delete pir mode test and fix docs code example

f1ec408

fix test

534795a

jeff41404 reviewed Sep 26, 2024

update

eaacb9e

NKNaN force-pushed the is_coalesced branch from a3d11ba to eaacb9e Compare September 27, 2024 02:42

luotao1 added the API label Oct 11, 2024

merge

586c6f0

NKNaN added 3 commits October 22, 2024 14:57

update

a1b4ba8

delete unnecessary

8f220e2

fix test

82f1043

zhwesky2010 previously approved these changes Oct 25, 2024

SigureMo reviewed Oct 25, 2024

update test

e883993

NKNaN dismissed zhwesky2010’s stale review via e883993 October 25, 2024 07:50

update: raise error if not sparsecootensor

3fe1aeb

SigureMo reviewed Oct 26, 2024

NKNaN and others added 2 commits October 27, 2024 20:28

Update paddle/fluid/pybind/eager_method.cc

6f53767

Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>

Apply suggestions from code review

975cd08

Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>

SigureMo approved these changes Oct 27, 2024

jeff41404 approved these changes Oct 28, 2024

luotao1 merged commit e8eb28b into PaddlePaddle:develop Oct 28, 2024
27 checks passed

NKNaN mentioned this pull request Oct 30, 2024

【Hackathon 7th No.25】Add is_coalesced to Paddle -part PaddlePaddle/docs#6925

Merged
