Skip to content

Conversation

@rymurr
Copy link
Contributor

@rymurr rymurr commented Mar 31, 2020

This attempts to complete ARROW-8230. However pulling Netty out of arrow-memory is much harder than I had anticipated.

The code changes are extensive and not altogether nice. I personally think we have to break down the task of removing Netty even further and I am posting this PR to get some feedback

This PR:

  • removes trivial dependencies on netty eg: PlatformDependent from arrow-memory
  • creates a new module arrow-memory-netty and moves netty specific stuff there
  • a number of methods were recently deprecated, they have been removed here but I think its too early
  • add a TrivialAllocationManager for when Netty isn't available
  • attempt to deal with instantiating a NettyAllocationManager when available and falling back to TrivialAllocationManager when it isn't
  • attempt to deal with the as yet outstanding hard references to Netty objects

I am not overly happy with this approach and would love some feedback about how to break this down further. @liyafan82 and @jacques-n especially as they are the reporters of this issue and its parent.

Note: most arrow-memory tests are broken on this branch and a number of other modules are not compiling either
Note: I have left ArrowBuf under io.netty.buffer as this will be moved out by #6729

This creates a new module `arrow-memory-netty` and attempts to move all
netty specific references to this module.
@github-actions
Copy link

@liyafan82
Copy link
Contributor

Hi @rymurr Thanks for the quick response.
I briefly looked at the PR, and have some high-level comments:

  1. Maybe there is no need to remove the netty dependency from the memory module in this PR, as it is not ready yet (IMO), and lots of changes are required.
  2. Maybe there is no need for the TrivialAllocationManager, as we are in the process of providing another allocator independent of netty and capable of supporting 64-bit buffers (https://github.com/apache/arrow/pull/6323/files#diff-26fb642f66a09df6024fa977ac762199)

IMO, this PR should perform the following tasks

  1. Create a separate module (arrow-memory-netty), and make it depend on arrow-memory.
  2. Move netty related classes to the new module. This may cause some cyclic dependencies, and we can work around this by leaving the old classes there, and make them deprecated. They should be removed in a future PR.

@rymurr
Copy link
Contributor Author

rymurr commented Apr 2, 2020

hey @liyafan82 I appreciate the feedback!

I agree, I will open a ticket for the removal of netty from arrow-memory for another PR. I had not seen the UnsafeAllocationManagerand it definitely makes more sense than mine. I will wait till that PR is merged and rebase against it.

As for the cyclic dependencies maven won't allow arrow-memory and arrow-memory-netty to directly depend on each other directly so the separation into a new module can't happen until we are able to cut the netty dependency completely. This means removing anything referencing Netty Buffers completely from things like BaseAllocator.

One way forward would be to introduce two new modules arrow-memory-netty and arrow-memory-base. Then arrow-memory-netty would depend on arrow-memory-base and arrow-memory would depend on both. We could separate out the netty specific stuff more easily and once the break is complete we would delete arrrow-memory-base and just have arrow-memory-netty depend on arrow-memory. I think this is a bit much for such a change though and am not a huge fan.

My best way forward is to:

  1. Raise a PR to remove direct netty references from everything but the classes under the io.netty.buffer package, NettyAllocationManager and ArrowByteBufAllocator.
  2. Remove NettyAllocationManager hard references from BaseAllocator and other classes that don't require netty and aren't already marked @Deprecated
  3. Wait until the @Deprecated methods have been publicly deprecated for long enough (I believe they were deprecated only recently and were not in the 0.16.0 release). Perhaps just 0.17.0 is enough and we can remove them for 1.0.0? Otherwise we can wait longer, however there is something nice about removing the hard netty requirement before we release 1.0.0.
  4. Remove @Deprecated methods, create arrow-memory-netty and remove netty dependencies from arrow-memory
  5. Celebrate 🎉

What do you think @liyafan82 ? If you agree I will close this PR and begin on 1. from above.

@liyafan82
Copy link
Contributor

@rymurr Your plan sounds reasonable.
I mostly agree with your points, with some minor concerns:

  1. I am not sure if we need to create the arrow-memory-base module. However, to choose between different allocators, we need some code that has access to both arrow-memory and arrow-memory-netty. Maybe we can have a discussion by then.

  2. I am not sure how long should we wait before completely removing the depecated stuffs.

Anyway, it is ok to start removing netty references now.

@rymurr
Copy link
Contributor Author

rymurr commented Apr 3, 2020

Hey @liyafan82 thanks for the comments. I agree, I don't think we should create the arrow-memory-base module. I think the simplest (maybe not the best) way to choose between allocators is via reflection. Something more elegant than, but along the same lines as how I instantiated different allocators in the RootAllocator in this PR.

In the meantime I will close this PR and start one to minimize Netty usage in arrow-memory and we can revisit when we feel the @Deprecated methods have been around long enough,

@rymurr rymurr closed this Apr 3, 2020
@liyafan82
Copy link
Contributor

Hey @liyafan82 thanks for the comments. I agree, I don't think we should create the arrow-memory-base module. I think the simplest (maybe not the best) way to choose between allocators is via reflection. Something more elegant than, but along the same lines as how I instantiated different allocators in the RootAllocator in this PR.

In the meantime I will close this PR and start one to minimize Netty usage in arrow-memory and we can revisit when we feel the @Deprecated methods have been around long enough,

Sure, reflection is a feasible solution. Please go ahead with minimizing netty usage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants