implement as strided #7275
Conversation
lcylcy commented Jan 17, 2022 (edited)
const std::vector<int32_t>& stride,
const int32_t& storage_offset) const {
  MutableAttrMap attrs;
  JUST(attrs.SetAttr<std::vector<int32_t>>("size", size));
Do we need to validate the vector sizes and element values of `size` and `stride` here?
OK.
This title.
add_docstr(
    oneflow.as_strided,
    r"""
    Create a view of an existing torch.Tensor input with specified size, stride and storage_offset.
`torch.Tensor` -> `oneflow.Tensor`. Also, since this docstring is basically copied over from PyTorch, it would be best to add a note that the documentation comes from PyTorch.
OK.
Where should I make that change?
const std::shared_ptr<one::Tensor>& input,
const std::vector<int32_t>& size,
const std::vector<int32_t>& stride,
const int32_t& storage_offset) const {
Here, and in the AsStridedFunctor above, I feel `const int32_t& storage_offset` would be better written as plain `int32_t storage_offset`:
- If the motivation is that pass-by-reference is more efficient than pass-by-value, there is no such gain for primitive types: a reference is implemented as a pointer underneath, and on a 64-bit platform `sizeof` a pointer is 8, so it actually costs more than passing an `int32_t` directly.
- If the intent is just the `const` qualifier, to prevent modification inside the function body, the reference is not needed for that either; and it is generally unnecessary anyway.
- Even purely from the angle of typing fewer characters, shorter declarations are preferable.
- I know many functors use this same pattern, and that is completely fine, but it does not mean it is necessarily optimal.
The above is only my personal view, offered for discussion; changing it or not are both fine.
ctx->size = JUST(composed_attrs.GetAttr<std::vector<int32_t> >("size"));
ctx->stride = JUST(composed_attrs.GetAttr<std::vector<int32_t> >("stride"));
The space between the two `> >` here is a leftover from the C++98 era; that restriction no longer exists.
template<typename T>
struct AsStridedFunctor final {
  Maybe<void> operator()(ep::Stream* stream, const T* input_buf, T* output_buf, const int64_t* dest_dims, const int32_t* stride,
                         const int32_t dest_num_dims, const int32_t storage_offset, const int32_t input_num, const int32_t output_num) {
For the primitive types passed by value here, I would suggest dropping the `const` qualifier; the function is short, so there is no need to be this careful: you will not modify them yourself, and callers will not casually change them either.
This does not need to be changed; just raising it for discussion, curious what you think~
template<typename T>
struct AsStridedGradFunctor final {
  Maybe<void> operator()(ep::Stream* stream, const T* dy_buf, T* dx_buf, const int64_t* dy_dims, const int32_t* stride,
                         const int32_t dy_num_dims, const int32_t storage_offset, const int32_t dx_num, const int32_t dy_num) {
Same as above.
~CpuAsStridedKernel() = default;

 private:
  void Compute(user_op::KernelComputeContext* ctx) const override {
Shouldn't this declaration be added here:
`using user_op::OpKernel::Compute;`
Otherwise a fairly large block of warnings can appear. It may not happen for .cpp files; I hit it when compiling a .cu file with clang (probably the diagonal_kernel.cu you wrote). For the cause, see this link: https://stackoverflow.com/questions/21462908/warning-overloaded-virtual-function-baseprocess-is-only-partially-overridde
const auto size = ctx->Attr<std::vector<int32_t>>("size");
const auto stride = ctx->Attr<std::vector<int32_t>>("stride");
const int32_t storage_offset = ctx->Attr<int32_t>("storage_offset");

size_t dest_num_dims = output->shape().NumAxes();
const int64_t *dest_dims = output->shape().ptr();
const size_t input_num = input->shape().Count(0);
const size_t output_num = output->shape().Count(0);
I feel all the `const` here could be dropped, and the types could mostly be left to `auto` rather than spelled out explicitly; for the reasoning see roughly Items 1 through 6 of Effective Modern C++. Of course this can also stay as-is; I am only offering a personal view for discussion.
const auto size = ctx->Attr<std::vector<int32_t>>("size");
const auto stride = ctx->Attr<std::vector<int32_t>>("stride");
const int32_t storage_offset = ctx->Attr<int32_t>("storage_offset");

size_t dy_num_dims = dy->shape().NumAxes();
const int64_t *dy_dims = dy->shape().ptr();
const size_t dx_num = dx->shape().Count(0);
const size_t dy_num = dy->shape().Count(0);
On the use of `const` and `auto`, same as mentioned before; just for discussion.
~GpuAsStridedGradKernel() = default;

 private:
  void Compute(user_op::KernelComputeContext* ctx) const override {
As mentioned earlier, the warning issue may show up here as well (when compiling with clang; I haven't tried gcc).
OK.
The warning looks roughly like this, for your reference:
../oneflow/user/kernels/diagonal_kernel.cu(109): warning: overloaded virtual function "oneflow::user_op::OpKernel::Compute" is only partially overridden in class "oneflow::GpuDiagonalBackwardKernel<half>"
detected during:
instantiation of class "oneflow::GpuDiagonalBackwardKernel<T> [with T=half]"
../oneflow/core/framework/op_kernel.h(330): here
instantiation of "oneflow::user_op::OpKernel *oneflow::user_op::NewOpKernel<T>() [with T=oneflow::GpuDiagonalBackwardKernel<half>]"
../oneflow/core/framework/user_op_kernel_registry.h(84): here
instantiation of "oneflow::user_op::OpKernelRegistry &oneflow::user_op::OpKernelRegistry::SetCreateFn<T>() [with T=oneflow::GpuDiagonalBackwardKernel<half>]"
(152): here
../oneflow/user/kernels/diagonal_kernel.cu(109): warning: overloaded virtual function "oneflow::user_op::OpKernel::Compute" is only partially overridden in class "oneflow::GpuDiagonalBackwardKernel<float>"
detected during:
instantiation of class "oneflow::GpuDiagonalBackwardKernel<T> [with T=float]"
../oneflow/core/framework/op_kernel.h(330): here
instantiation of "oneflow::user_op::OpKernel *oneflow::user_op::NewOpKernel<T>() [with T=oneflow::GpuDiagonalBackwardKernel<float>]"
../oneflow/core/framework/user_op_kernel_registry.h(84): here
instantiation of "oneflow::user_op::OpKernelRegistry &oneflow::user_op::OpKernelRegistry::SetCreateFn<T>() [with T=oneflow::GpuDiagonalBackwardKernel<float>]"
(153): here
../oneflow/user/kernels/diagonal_kernel.cu(109): warning: overloaded virtual function "oneflow::user_op::OpKernel::Compute" is only partially overridden in class "oneflow::GpuDiagonalBackwardKernel<double>"
detected during:
instantiation of class "oneflow::GpuDiagonalBackwardKernel<T> [with T=double]"
../oneflow/core/framework/op_kernel.h(330): here
instantiation of "oneflow::user_op::OpKernel *oneflow::user_op::NewOpKernel<T>() [with T=oneflow::GpuDiagonalBackwardKernel<double>]"
../oneflow/core/framework/user_op_kernel_registry.h(84): here
instantiation of "oneflow::user_op::OpKernelRegistry &oneflow::user_op::OpKernelRegistry::SetCreateFn<T>() [with T=oneflow::GpuDiagonalBackwardKernel<double>]"
(154): here
Yes, that's the file I wrote before. Now I see where the warning comes from.
device = random_device()
x = random_pytorch_tensor(
    ndim=4,
    dim1=random(3, 6),
    dim2=random(3, 6),
    dim3=random(3, 6),
    dim4=random(3, 6),
).to(device)
I don't know whether this op supports all of 1, 2, 3, 4, and 5 dimensions; if it does, you could use a randomized approach to test every dimensionality. You can refer to this code:
import numpy as np  # needed for the random helpers below

# random ndim in range [1, 5]
ndim = np.random.randint(1, 6)
dim0 = np.random.randint(4, 10) * 8
dim1 = np.random.randint(4, 10) * 8
dim2 = np.random.randint(4, 10) * 8
dim3 = np.random.randint(4, 10) * 8
dim4 = np.random.randint(4, 10) * 8
if ndim == 1:
    x = random_pytorch_tensor(1, dim0)
elif ndim == 2:
    x = random_pytorch_tensor(2, dim0, dim1)
elif ndim == 3:
    x = random_pytorch_tensor(3, dim0, dim1, dim2)
elif ndim == 4:
    x = random_pytorch_tensor(4, dim0, dim1, dim2, dim3)
elif ndim == 5:
    x = random_pytorch_tensor(5, dim0, dim1, dim2, dim3, dim4)
OK.
>>> import oneflow as flow

>>> input = flow.rand(2,3,5)
>>> output = flow.as_strided(input, (2,3,3), (1,2,3), 1)
Does this part of the docs need to show the contents of input and output? I see that PyTorch's torch.as_strided displays them. What is our convention on this? Just follow whatever our side requires; this is only for reference.
There doesn't seem to be a rule about that.
Speed stats:
CI failed when running job: cuda-speed-test. PR label automerge has been removed |