Feat support host memory #9928
oneflow/core/framework/op_interpreter/eager_global_op_interpreter.cpp
Symbol<ParallelDesc> dst_parallel_desc =
    is_host_input
        ? JUST(ReplaceDeviceType(infered_input_meta->parallel_desc(), DeviceType::kCPU))
        : infered_input_meta->parallel_desc();
When an op's input is of HostMemory type, the device type of the boxing output parallel_desc is set to CPU.
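The selection logic in this snippet can be illustrated in isolation. The following is a minimal sketch, not OneFlow code: `DeviceType`, `ParallelDesc`, and `ReplaceDeviceType` here are simplified stand-ins that only borrow the names from the diff, and `SelectDstParallelDesc` and the `placement` field are hypothetical.

```cpp
#include <string>

// Simplified stand-ins for illustration only; the real OneFlow types differ.
enum class DeviceType { kCPU, kCUDA };

struct ParallelDesc {
  DeviceType device_type;
  std::string placement;  // hypothetical field, e.g. the rank layout
};

// Returns a copy of `desc` with only the device type swapped.
ParallelDesc ReplaceDeviceType(const ParallelDesc& desc, DeviceType device_type) {
  ParallelDesc result = desc;
  result.device_type = device_type;
  return result;
}

// Hypothetical helper: host-memory inputs are boxed to CPU,
// all other inputs keep the inferred parallel description.
ParallelDesc SelectDstParallelDesc(const ParallelDesc& inferred, bool is_host_input) {
  return is_host_input ? ReplaceDeviceType(inferred, DeviceType::kCPU) : inferred;
}
```

Only the device type changes for a host-memory input; the rest of the description is carried over unchanged.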
oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp
const auto& host_input = JUST(functional::To(
    inputs.at(i), Optional<Symbol<Device>>(JUST(GetDefaultCpuDevice())), NullOpt, false));
input_eager_blob_objects.at(i) = JUST(host_input->eager_blob_object());
host_inputs.emplace_back(host_input);
Extend the lifetime of host_input so that it is not destructed prematurely.
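The lifetime-extension pattern in this comment can be shown with plain `std::shared_ptr`. This is a minimal sketch under the assumption that the tensor is reference-counted; `Tensor` and `TemporarySurvivesScope` are hypothetical names, and only `host_input` and `host_inputs` mirror the diff.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for a tensor; not a OneFlow type.
struct Tensor {
  explicit Tensor(int v) : value(v) {}
  int value;
};

// Reports whether a temporary created in an inner scope is still alive
// after that scope ends, depending on whether an outer container owns it.
bool TemporarySurvivesScope(bool keep_alive) {
  std::vector<std::shared_ptr<Tensor>> host_inputs;  // owners that outlive the inner scope
  std::weak_ptr<Tensor> observer;                    // non-owning view used to check liveness
  {
    std::shared_ptr<Tensor> host_input = std::make_shared<Tensor>(42);
    observer = host_input;
    if (keep_alive) { host_inputs.emplace_back(host_input); }
  }  // host_input goes out of scope here
  return !observer.expired();
}
```

With `keep_alive` set, the copy stored in `host_inputs` keeps the reference count above zero until the container itself is destroyed.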
View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9928/
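Implements the HostMemoryInput mechanism, which allows a given input of an op to be declared as a HostMemoryInput. It is declared as follows:
When an input is declared as a HostMemoryInput, its data can be accessed directly inside the host function body of the kernel.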
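To illustrate what direct host access means in practice, here is a minimal sketch that does not use the OneFlow kernel API. `AddHostScalar` and its parameters are hypothetical; the only assumption taken from the description is that the host-memory input lives in CPU-addressable memory and can be dereferenced from ordinary host code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side compute function. `scalar` points to host memory,
// so it is read directly on the CPU without any device-to-host copy.
void AddHostScalar(const std::vector<float>& in, const float* scalar,
                   std::vector<float>* out) {
  const float s = *scalar;  // direct read of the host-memory input
  out->resize(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) { (*out)[i] = in[i] + s; }
}
```

In a real kernel the remaining inputs would typically live on the device; they are modelled as host vectors here only to keep the sketch self-contained.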