Add sparse conv3d kernel #39879
Conversation
Thanks for your contribution!
@@ -40,6 +40,19 @@ inline const DDim InferDenseDims(const DDim& x_dims,
  return values_dims;
}

template <typename Context>
inline void GetGpuLaunchConfig1D(const Context& dev_ctx,
Suggest directly reusing the existing implementation.
OK. That function hasn't been migrated over yet, so this is a temporary copy; it will be cleaned up uniformly in a follow-up PR.
DenseTensor* rulebook) {
  // update padding and dilation
  // Currently, only support x.layout is NDHWC, groups = 1
  // if x.layout != NDHWC then transpose(x), transpose(weight)
If the layout is channel-first, does the user have to do the transpose themselves?
The transpose kernel hasn't been migrated over yet, so the plan is to fix up the layout in the frontend API first and then pass it down.
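For reference, the frontend fix-up the author describes amounts to a channel-first-to-channel-last permute before the kernel is invoked. This is a minimal illustrative sketch on a flat buffer; the function name and signature are hypothetical, not Paddle's actual frontend code:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch: permute a dense NCDHW buffer into the NDHWC layout
// that the sparse conv3d kernel currently requires. The real frontend would
// operate on tensors, not raw vectors.
std::vector<float> TransposeNCDHWToNDHWC(const std::vector<float>& src,
                                         int64_t n, int64_t c,
                                         int64_t d, int64_t h, int64_t w) {
  std::vector<float> dst(src.size());
  for (int64_t in = 0; in < n; ++in)
    for (int64_t ic = 0; ic < c; ++ic)
      for (int64_t id = 0; id < d; ++id)
        for (int64_t ih = 0; ih < h; ++ih)
          for (int64_t iw = 0; iw < w; ++iw) {
            // linear offset in NCDHW order
            int64_t s = (((in * c + ic) * d + id) * h + ih) * w + iw;
            // linear offset in NDHWC order (channel becomes innermost)
            int64_t t = (((in * d + id) * h + ih) * w + iw) * c + ic;
            dst[t] = src[s];
          }
  return dst;
}
```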
  HOSTDEVICE const int& operator[](int i) const { return dims[i]; }
};

inline HOSTDEVICE bool Check(const int& x,
Is this file shared between CPU and GPU?
Yes.
  return (lower >= 0 && lower % stride == 0 && uper < xdim);
}

inline HOSTDEVICE bool Check(const Dims4D& dims,
Please add a comment describing what this function does.
OK, I'll add it in the next PR (the GPU code).
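Until that comment lands, here is one hedged reading of the single-axis check shown in the diff. The parameter names follow the snippet above, but the interpretation in the comments is ours, not documented by this PR:

```cpp
// Hedged reading of the Check() helper in the diff: for one spatial axis,
// `lower` is assumed to be the numerator used to recover a coordinate
// (something like input coord + padding - dilation * kernel offset), and
// `uper` a coordinate that must stay inside the dimension size `xdim`.
inline bool CheckAxis(int lower, int uper, int stride, int xdim) {
  // Valid only if the numerator is non-negative, lands exactly on a
  // stride multiple, and the resulting coordinate is in range.
  return (lower >= 0 && lower % stride == 0 && uper < xdim);
}
```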
DDim* out_dims) {
  PADDLE_ENFORCE_EQ(x_dims.size(),
                    5,
                    paddle::platform::errors::InvalidArgument(
Suggest dropping the paddle::platform:: prefix; phi has its own errors implementation.
done
                        "the shape of x should be (N, D, H, W, C)"));
  PADDLE_ENFORCE_EQ(kernel_dims.size(),
                    5,
                    paddle::platform::errors::InvalidArgument(
Please fix the other occurrences as well.
done
}  // namespace phi

PD_REGISTER_KERNEL(
    conv, CPU, ALL_LAYOUT, phi::sparse::Conv3dKernel, float, double) {}
Shouldn't this be sparse_conv? If it takes the kernel name conv here, the original conv can no longer be migrated.
Or should this kernel only be usable with the SPARSE layout?
PD_REGISTER_KERNEL(
    conv, CPU, SPARSE_COO, phi::sparse::Conv3dKernel, float, double) {
  kernel->InputAt(1).SetLayout(phi::DataLayout::NCHW);
}
As discussed, use ALL_LAYOUT for now, and switch to SPARSE_COO once the layout can be obtained in cmake:
PD_REGISTER_KERNEL(
    sparse_conv3d, CPU, ALL_LAYOUT, phi::sparse::Conv3dKernel, float, double) {
  kernel->InputAt(0).SetDataLayout(phi::DataLayout::SPARSE_COO);
}
LGTM
PR types
New features
PR changes
OPs
Describe
This PR implements the serial CPU forward pass of phi::sparse::Conv3dKernel; the GPU code is in a follow-up PR.
sparse conv3d uses the SECOND algorithm. The overall idea:
Because the input is sparse data stored in a SparseCooTensor, i.e. only a list of non-zero elements where the channel vector at each coordinate is one basic non-zero element, it is hard to extract a window (e.g. 3x3) of data directly. The algorithm therefore builds an input-output mapping table, the rulebook, and then performs gather, gemm and scatter according to it.
Build the rulebook:
gather: for each kernel position, use the input indices recorded in the rulebook to collect the corresponding input rows into a temporary tensor tmp_in(n, in_channels), where n is the number of input entries participating at this position.
gemm: multiply tmp_in by this position's kernel slice tmp_kernel(in_channels, out_channels) to get tmp_out(n, out_channels).
scatter: scatter-add all the tmp_out tensors, i.e. accumulate entries that map to the same output position to produce the final result.