Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support inlining task arguments #72

Merged
merged 27 commits into from
Sep 5, 2023
Merged

Support inlining task arguments #72

merged 27 commits into from
Sep 5, 2023

Conversation

omus
Copy link
Member

@omus omus commented Aug 31, 2023

Should improve performance for Ray task calls which use small sized arguments.

@codecov-commenter
Copy link

codecov-commenter commented Aug 31, 2023

Codecov Report

Merging #72 (c6b0298) into main (5b92046) will increase coverage by 2.01%.
The diff coverage is 92.30%.

@@            Coverage Diff             @@
##             main      #72      +/-   ##
==========================================
+ Coverage   69.26%   71.27%   +2.01%     
==========================================
  Files           6        6              
  Lines         257      282      +25     
==========================================
+ Hits          178      201      +23     
- Misses         79       81       +2     
Flag Coverage Δ
Ray.jl 71.27% <92.30%> (+2.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
Ray.jl/src/Ray.jl 100.00% <ø> (ø)
Ray.jl/src/runtime.jl 51.53% <92.30%> (+7.33%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@omus
Copy link
Member Author

omus commented Aug 31, 2023

I've been attempting to construct the TaskArgByValue and TaskArgByReference instances in Julia and then pass them directly into
submit_task. However, this is challenging as I need to pass in a std::vector<std::unique_ptr<TaskArg>>.

Compiling this example fails:

size_t demo(const std::vector<std::unique_ptr<std::string>> &vector) {
    return vector.size();
}

JLCXX_MODULE define_julia_module(jlcxx::Module& mod)
{
    mod.method("demo", &demo);
}

Failure:

bazel-out/darwin_arm64-opt/bin/external/libcxxwrap_julia/_virtual_includes/headers/jlcxx/stl.hpp:173:89: error: object of type 'std::__deque_base<std::unique_ptr<std::string>, std::allocator<std::unique_ptr<std::string>>>::value_type' (aka 'std::unique_ptr<std::string>') cannot be assigned because its copy assignment operator is implicitly deleted
    wrapped.method("cxxsetindex!", [](WrappedT& v, const T& val, cxxint_t i) { v[i - 1] = val; });

As reported in: JuliaInterop/CxxWrap.jl#370

I've worked around this issue for now by creating a StdVector of TaskArg pointers and then converting these pointers to std::unique_ptr inside of the C++ code.

One addition challenge was around TaskArg being an abstract C++ class. In particular attempting to use a Julia Vector{CxxRef{TaskArg}} and pushing CxxRef{TaskArgByValueAllocated} values fails as it tries to convert to an abstract type. Using a Vector{Any} does work but when converting to a StdVector we get a StdVector{Any} which is an issue for calling the C++ code. The solution here was to start with an empty StdVector allocated on the Julia side and then push elements to it. This works but required addtional work since CxxWrap was automatically dereferencing my values.

@omus
Copy link
Member Author

omus commented Sep 1, 2023

I merged in #74 here as there was some overlap which makes the diff messy (TIL). Once we merge #74 I'll clean this PR up. Besides that this PR is ready for review.

@omus omus marked this pull request as ready for review September 1, 2023 17:08
@omus
Copy link
Member Author

omus commented Sep 1, 2023

I've rebased these changes to no longer depend on #74

Comment on lines 240 to 246
function transform_task_args(task_args)
task_arg_ptrs = StdVector{CxxPtr{ray_jll.TaskArg}}()
for task_arg in task_args
push!(task_arg_ptrs, CxxRef(task_arg))
end
return task_arg_ptrs
end
Copy link
Member Author

@omus omus Sep 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One liner doesn't currently work:

julia> using Ray, CxxWrap

julia> Ray.init()

julia> task_args = Ray.serialize_args([1,2,3])
3-element Vector{Any}:
 ray_core_worker_julia_jll.TaskArgByValueAllocated(Ptr{Nothing} @0x0000600000042100)
 ray_core_worker_julia_jll.TaskArgByValueAllocated(Ptr{Nothing} @0x0000600000043be0)
 ray_core_worker_julia_jll.TaskArgByValueAllocated(Ptr{Nothing} @0x0000600000042560)

julia> CxxRef.(convert.(ray_jll.TaskArg, task_args))
3-element Vector{CxxRef{ray_core_worker_julia_jll.TaskArg}}:
 CxxRef{ray_core_worker_julia_jll.TaskArg}(Ptr{ray_core_worker_julia_jll.TaskArg} @0x0000600000051300)
 CxxRef{ray_core_worker_julia_jll.TaskArg}(Ptr{ray_core_worker_julia_jll.TaskArg} @0x000060000004de60)
 CxxRef{ray_core_worker_julia_jll.TaskArg}(Ptr{ray_core_worker_julia_jll.TaskArg} @0x000060000004d0e0)

julia> StdVector{CxxPtr{ray_jll.TaskArg}}(CxxRef.(convert.(ray_jll.TaskArg, task_args)))
ERROR: MethodError: no method matching StdVector{CxxPtr{ray_core_worker_julia_jll.TaskArg}}(::Vector{CxxRef{ray_core_worker_julia_jll.TaskArg}})

Closest candidates are:
  StdVector{CxxPtr{ray_core_worker_julia_jll.TaskArg}}()
   @ ray_core_worker_julia_jll ~/.julia/packages/CxxWrap/aXNBY/src/CxxWrap.jl:624

Stacktrace:
 [1] top-level scope
   @ REPL[15]:1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#80

Copy link
Member

@kleinschmidt kleinschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any objections here; a few tiny suggestions and clarification questions but otherwise LGTM

Ray.jl/src/runtime.jl Outdated Show resolved Hide resolved
Comment on lines +593 to +601
.method("max_direct_call_object_size", [](RayConfig &config) {
return config.max_direct_call_object_size();
})
.method("task_rpc_inlined_bytes_limit", [](RayConfig &config) {
return config.task_rpc_inlined_bytes_limit();
})
.method("record_ref_creation_sites", [](RayConfig &config) {
return config.record_ref_creation_sites();
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do these need to be lambdas? seem pretty straightforward...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure now. I'll try this without the lambdas

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fails with this:

wrapper.cc:582:59: error: call to non-static member function without an object argument
        .method("max_direct_call_object_size", RayConfig::max_direct_call_object_size)
                                               ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~

function _submit_task(fd, oids::AbstractVector, serialized_runtime_env_info, resources)
# https://github.com/JuliaInterop/CxxWrap.jl/issues/367
args = isempty(oids) ? StdVector{ObjectID}() : StdVector(oids)
function _submit_task(fd, args, serialized_runtime_env_info, resources::AbstractDict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to remove the AbstractVector restriction here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The args passed in here are a StdVector so if we'd leave the type assertion in we'd have to update it. Mainly this assertion was dropped as this method should only be called if the resources argument is a AbstractDict

Comment on lines +155 to +156
for (auto &task_arg : task_args) {
args.emplace_back(task_arg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the element type is unique_ptr already I'm assuming this will auto-cast to unique_ptr?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah wait I always forget, emplace_back constructs and pushes...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the element type is unique_ptr already I'm assuming this will auto-cast to unique_ptr?

Just to avoid any possible confusion the task_arg element is just a raw pointer and is not already a unique_ptr

Ray.jl/test/runtime.jl Show resolved Hide resolved
Comment on lines +211 to +215
# Note: The Python `prepare_args_internal` function checks if the `arg` is an
# `ObjectRef` and in that case uses the object ID to directly make a
# `TaskArgByReference`. However, as the `args` here are flattened the `arg` will
# always be a `Pair` (or a list in Python). I suspect this Python code path just
# dead code so we'll exclude it from ours.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we be doing this though? the values will be automatically dereferenced on the the task execution side so I'm not so sure; then again, if we don't do it then users may be surprised if they do have to de-reference in their work functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Python code I'm referring to is attempting to avoid creating a new ObjectRef here if the user has passed in one as an argument. In all cases Python will dereference the arguments inside the called task. What I was specifically calling out here is that due to flatten_args this optimization will never occur as an arg is never a ObjectRef. At best it would be a list containing an ObjectRef as the second value.

Ray.jl/src/runtime.jl Outdated Show resolved Hide resolved
@omus omus merged commit c630b0f into main Sep 5, 2023
4 checks passed
@omus omus deleted the cv/task-arg-by-value branch September 5, 2023 17:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants