@wangxiaobaidu11
Contributor

wangxiaobaidu11 commented Sep 7, 2021

Support Hadoop 3.2.1.

Description

Upgrade to Guava 21.0, Guice 4.0, and Hadoop 3.2.1.


Key changed/added classes in this PR
  • TaskStatus.java
  • QueryStatus.java
  • BatchAppenderatorDriver.java
  • StreamAppenderatorDriver.java
  • DruidNode.java
  • TaskLockbox.java

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@clintropolis
Member

Thanks for the contribution, though this one might be a bit difficult to accept without some additional discussion. See https://lists.apache.org/thread.html/r9e867429b44a694a77ff62351a0e92a4ca06c3f0184b8d9d5019c517%40%3Cdev.druid.apache.org%3E for a dev mailing list discussion that happened earlier this year. Also check out PR #11314, which is associated with that thread, for an alternative approach that leans heavily into Hadoop 3 and uses the new shaded client jars instead of the invasive Hadoop dependency set Druid is currently using.

The main problem seems to be that there are still quite a few Hadoop 2.x clusters in the wild, which complicates how we should do this upgrade without breaking Druid for people in that environment. The best outcome would probably be some way to support both Hadoop 2.x and 3.x while also being free of the Hadoop dependency problem, but that doesn't seem to be a trivial amount of effort.

@wangxiaobaidu11
Contributor Author

Thank you for your reply. I just hope that those who use Hadoop 3 can use it directly without spending time on the upgrade themselves.

@clintropolis
Member

Yeah, hopefully people who are using Hadoop 3 today can at least use this PR as a reference for doing a custom build until we decide what to do at a project level.
