Use JDK 11 by default #427

Merged
merged 7 commits into from
Mar 8, 2022

karllessard
Collaborator

As discussed with the SIG, the minimum supported JDK version for TF Java will move from Java 8 to Java 11. This PR enables JDK 11 releases by default and allows users to build with JDK 17 by activating the jdk17 Maven profile.

@karllessard
Collaborator Author

OK, I think I now have a configuration that keeps all the different frameworks we use working together on JDK 17 (some Spotless checks still fail, but I'll push a separate commit for that at the end).

Now, if everything works as expected, the question is: do we really want to merge this? I'm raising the concern that if we decide to migrate to JDK 11 by default and only distribute binaries for this version, any other application or framework that uses our API will be forced to migrate to this version as well.

We all know that JDK 8 is very old and that it's time for Java users to mourn and move forward. Also, since TensorFlow Java has not been "officially" released yet (we are still in alpha mode), we do not need to guarantee backward compatibility with Java 8 forever. On the other hand, TensorFlow Java sits very low as a software layer (right above the native code), and therefore its migration will impact a larger audience. Are all our users ready for this migration?

Any thoughts?

@karllessard added the "CI build" label (triggers a full native build on a pull request) on Mar 3, 2022
@saudet
Contributor

saudet commented Mar 3, 2022

If the point of moving to JDK 11 is to be able to use new features, I think those are most useful in higher-level APIs like tensorflow-framework, so what about leaving tensorflow-core on JDK 8? We can have different modules use different versions of Java. That way we can still do all the basic things that most applications need to do with JDK 8.

@karllessard
Collaborator Author

We could benefit from using JDK 11 for core things as well, like Cleaners for example... I've just recalled that when SIG JVM was created and we moved the Java code of TF into the new repo, it was still supporting Java 7 for Android apps. We decided "OK, that's enough", and we could finally start writing lambdas, which eventually unblocked a few features, like concrete functions.

So the core will have to eventually migrate to something higher than JDK8... so the question is maybe more "when" should it do it?
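
For readers unfamiliar with the API mentioned above, here is a minimal sketch of how `java.lang.ref.Cleaner` (available since Java 9) can back an explicit `close()` with a safety net. `NativeHandleSketch` and its counter are hypothetical stand-ins for illustration and are not part of TF Java; a real implementation would free native memory where the counter is incremented.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an object that owns a native resource (not a TF Java class).
final class NativeHandleSketch implements AutoCloseable {
  // One shared Cleaner (and its daemon thread) for all instances.
  private static final Cleaner CLEANER = Cleaner.create();

  // The cleanup action must not reference the owning object,
  // otherwise the owner could never become unreachable.
  private static final class Release implements Runnable {
    private final AtomicInteger releaseCount;

    Release(AtomicInteger releaseCount) {
      this.releaseCount = releaseCount;
    }

    @Override
    public void run() {
      // Assumption: a real implementation would free the native resource here.
      releaseCount.incrementAndGet();
    }
  }

  private final AtomicInteger releaseCount = new AtomicInteger();
  private final Cleaner.Cleanable cleanable;

  NativeHandleSketch() {
    // Runs Release when this object becomes phantom reachable, or earlier via close().
    this.cleanable = CLEANER.register(this, new Release(releaseCount));
  }

  int releaseCount() {
    return releaseCount.get();
  }

  // Explicit, deterministic release; the cleaning action runs at most once.
  @Override
  public void close() {
    cleanable.clean();
  }

  public static void main(String[] args) {
    NativeHandleSketch handle = new NativeHandleSketch();
    handle.close();
    handle.close(); // second call is a no-op
    System.out.println("release count: " + handle.releaseCount());
  }
}
```

The explicit `close()` stays the primary release path in this sketch; the Cleaner only covers handles that are dropped without being closed, and it gives no guarantee about when that fallback runs.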

@saudet
Contributor

saudet commented Mar 4, 2022

Well, as you know, I don't think that using Cleaner for anything is a good idea, so... Anything else? Anyway, now that the idiotic lawsuits are over, Google can restart adding more Java features to Android, and they've added Cleaner, so it's not like it works only with JDK8 features: https://developer.android.com/reference/java/lang/ref/Cleaner

@karllessard
Collaborator Author

Yeah, I just gave Cleaners as an example of an interesting API to try out, but I don't expect miracles from it either. I think most of the interesting stuff for us is in JDK 17 and beyond (e.g. the foreign memory access API), and moving to JDK 11 would be just a step closer to it (but without reaching it)...

@JimClarke5, @Craigacp, I know we talked about this during our last SIG session, but can you also share your points of view in this thread?

@Craigacp
Collaborator

Craigacp commented Mar 4, 2022

My opinion is that there isn't a particularly strong pull to 11: while there are useful language features for users (var) and a bunch of things in the libraries that make file IO & collections nicer, I think it's only really Cleaners and maybe VarHandles that will make things easier for us inside TF core/ndarray. However, I think we should start moving off an 8-year-old version of Java. The ecosystem is moving off 8 (even Android is moving to 11), and the new versions of Spring, among others, are moving directly to 17. I think that by the end of the year, supporting only 11 or newer will not be a big deal. However, we should talk to the SparkNLP people before doing it, as Spark lags the ecosystem due to a dependence on older Scala versions and the difficulty of upgrading anything in the Hadoop ecosystem.
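
As a rough illustration of two of the features named above, the sketch below uses `var` (Java 10+) and a byte-array view `VarHandle` (Java 9+) to read and write a float directly in a `byte[]`. The class and method names are hypothetical; this is not how TF core or ndarray are implemented.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical example class, not part of TF Java or the NdArray library.
final class Jdk11FeaturesSketch {
  // Views a byte[] as little-endian floats, indexed by byte offset.
  private static final VarHandle FLOATS =
      MethodHandles.byteArrayViewVarHandle(float[].class, ByteOrder.LITTLE_ENDIAN);

  // Writes a float at byte offset 4 of a fresh buffer and reads it back.
  static float roundTrip(float value) {
    var buffer = new byte[Float.BYTES * 2]; // 'var' infers byte[]
    FLOATS.set(buffer, Float.BYTES, value);
    return (float) FLOATS.get(buffer, Float.BYTES);
  }

  public static void main(String[] args) {
    System.out.println(roundTrip(1.5f));
  }
}
```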

@maziyarpanahi

Hi @karllessard and @Craigacp

Thanks for having this discussion here. For Spark NLP, whenever it comes to deprecating something (Scala, Java, Spark, etc.) in favour of newer software for more features, maximizing performance, maintainability, security, or even ease of development, we have to look at a few things first:

  • Users demanding the newer technology
  • Industry adoption (managed services, platforms, etc.)
  • Apache Spark (EOL, ecosystem, third-party software, etc.)

Most of the time, the result is against our wishes. We would love to move entirely to Scala 2.13 on Java 11 today; however, the industry moves very slowly. As you can see, even the latest runtimes/versions of Databricks, AWS/EMR, and even GCP still support Java 8 without giving any EOL for it (at least for Hadoop/Spark clusters). The Apache Spark ecosystem itself still supports Java 8 in addition to 11, and Databricks had to extend the end-of-life for Spark 2.4.x (which only uses Java 8) twice from last year to this year.

My humble suggestion:

  • If possible, support both Java 8 and 11 (or only 8) until the end of 2022 and then move to Java 11, provided there is no blocker in tensorflow-java that can only be resolved by moving entirely to Java 11. If it's only to take advantage of new features in Java 11 and be ready for 17, I think the end of 2022 would be a good start.

Spark NLP itself cannot stop supporting Java 8 until the end of 2022, but even then we have to wait until Databricks, AWS, Google, and others stop Java 8 support entirely.

@karllessard
Collaborator Author

Thanks @maziyarpanahi, I also think that while we are all eager to jump to the most recent versions of the JDK, this migration must start from the end users, and buying time until the end of 2022 also seems reasonable to me.

So, speaking for TFJava, we have several options on the table:

  1. Stay on JDK8 and revisit that decision at the end of 2022
  2. Jump on JDK11 for TFJava future releases (0.5+) but continue to provide patches on the 0.4 branch to support JDK8
  3. Distribute artifacts for multiple JDK versions (multi-release JARs)
  4. Migrate the TFJava Framework to JDK11 but remain on JDK8 in TFJava Core and NdArray library.

Personally, I like option 2. It shifts some more maintenance work into our hands, but we've already started to provide CVE patches on previous releases anyway. We would probably not merge new features in the 0.4 release branch though. Any thoughts?
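
To make the trade-off behind option 3 a bit more concrete: with a multi-release JAR, the JVM itself picks version-specific classes from `META-INF/versions/N` at class-loading time. The sketch below only illustrates the underlying idea of choosing an implementation from the running Java version, in code that still compiles on Java 8. All names and the `"jdk8"`/`"jdk11"` labels are hypothetical and do not come from the TF Java build.

```java
// Hypothetical helper, not part of TF Java.
final class JavaVersionSketch {
  // Parses the "java.specification.version" format:
  // "1.8" on Java 8, and "11", "17", ... on later releases.
  static int featureVersion(String specVersion) {
    String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
    int dot = v.indexOf('.');
    return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
  }

  // Assumption: two implementation variants, one per supported baseline.
  static String selectImplementation(int featureVersion) {
    return featureVersion >= 11 ? "jdk11" : "jdk8";
  }

  public static void main(String[] args) {
    int feature = featureVersion(System.getProperty("java.specification.version"));
    System.out.println("Java " + feature + " -> " + selectImplementation(feature));
  }
}
```

Option 2 avoids this kind of branching entirely, at the cost of maintaining a separate release branch for JDK 8 users.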

@maziyarpanahi

Thanks @karllessard for the understanding. I also personally very much like option 2: in addition to security patches, fixes for blocking bugs can also be released on those branches, for those who need to wait before moving to a new major TF release that, let's say, drops CUDA 11.x in the future.

@karllessard removed the "CI build" label (triggers a full native build on a pull request) on Mar 8, 2022
@karllessard merged commit 02ec490 into tensorflow:master on Mar 8, 2022
@karllessard deleted the jdk11 branch on March 8, 2022 16:12
@karllessard
Collaborator Author

All right, that's what we'll do then. This is now merged for 0.5.x and future releases. Thanks everyone!

4 participants