-
Notifications
You must be signed in to change notification settings - Fork 38
Add initial IFRT Julia bindings #738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
src/xla/Buffer.jl
Outdated
|
|
||
| mutable struct Buffer | ||
| buffer::Ptr{Cvoid} | ||
| holded::Ptr{Cvoid} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should just make a second function HeldBuffer which just contains the heldbuffer pointer, and then for IFRT we shouldn't need to use the old buffer class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmm okay, but we need this HeldBuffer inside Buffer to avoid it being destructed when IFRT is using it and Buffer is GCed.
how about this?
| holded::Ptr{Cvoid} | |
| holded::Union{Nothing,HeldBuffer} |
ed714ed to
9a8f9a7
Compare
|
|
||
| mutable struct Buffer | ||
| buffer::Ptr{Cvoid} | ||
| held::Union{Nothing,HeldBuffer} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this?
In essence I feel like we should consider buffer and heldbuffer as two totally different API's. Of course under the hood technically one can be converted to the other, but we shouldn't merge them unless needed. This will make it easier to transition the rest of the code from PJRT -> IFRT (and also the inner buffers having a union will make them slow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I agree with your view that it would be cleaner and more performant. but I'm worried about destruction of the objects and having "use after free" bugs. In this transition period, we will have both Buffer and HeldBuffer.
if Buffer takes ownership of it, then the underlying xla::PjRtBuffer will be freed when Buffer is GCed. if we have a HeldBuffer around, it will be broken because the ptr to which the shared_ptr tries to point will be already freed.
consider the opposite: HeldBuffer takes ownership of it. the same problem applies: HeldBuffer can be GCed before Buffer and Buffer will be broken because the pointer will already be freed if you try to use it again.
here is the dependency graph: both Buffer and HeldBuffer have references to the same xla::PjRtBuffer object.
flowchart TB
subgraph Julia
Buffer
HeldBuffer
end
Buffer --> PjRtBuffer
HeldBuffer --> shared_ptr
shared_ptr --> PjRtBuffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO in order to avoid these issues, the first implementation was best:
- it avoids double-free and use-after-free issues
- both
HeldBufferandBufferare separate types which makes incremental transition easier - no overheads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should never have a heldbuffer which points to the same data as a regular buffer. Each should be considered an exclusive owner of its underyling data.
And analagously we should avoid making a held buffer out of an existing pjrt buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that way there would never be any use after free issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that means that we would be duplicating data, because PjRtBuffer holds raw data, which will be costly
mmmm or we will need to replicate stuff for HeldBuffer. lemme try this weekend
| mutable struct Client | ||
| client::Ptr{Cvoid} | ||
| global_ordinals::Vector{Cint} | ||
| holded::Ptr{Cvoid} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment here above having a separation
ideally we have
abstract struct Client
struct PJRTClient <: Client
...
end
struct IFRTClient <: Client
...
end
returns