Take advantage of multi-stage builds and modular JDK images #180

Closed
gunnarmorling opened this issue Nov 5, 2018 · 11 comments

Comments

@gunnarmorling

Hey @rhuss, over the weekend I've been musing about what the "ideal" Dockerfile for a Java app would look like. My requirements essentially are these:

  • Small base image and small result image; I'm not counting beans here, so e.g. I prefer the Fedora/CentOS/RHEL minimal image with proper glibc over Alpine with musl, but I don't want multiple hundred MB either
  • Avoidance of unnecessary rebuilding of image layers

You can find the Dockerfile I came up with here: debezium/debezium-examples#42.

It's doing three things:

  • One stage creates a minimal JDK with just the required modules
  • One stage builds the project and adds the dependencies
  • The final stage assembles all required files

The first and second stages only do "heavy" work if needed, i.e. if the POM has changed (for the dependencies that's done automatically by virtue of leveraging mvn dependency:go-offline; for the JDK modules it's a manual process). It's using the Fedora minimal image as the base for the resulting image. So the size and rebuilding penalties are quite neat:

  • ~180 MB overall, out of which only a very thin layer must be rebuilt upon actual code changes. Lower layers (dependencies, JDK) must only be rebuilt if actually needed. Also, the Maven repo is not part of the resulting image itself; still, no re-downloads of all the Maven deps are needed, as the repo lives in an (untagged) image produced by the earlier build stage.
  • Only the last image layer containing the actual application JAR must be rebuilt most of the time, resulting in very quick turn-around times in terms of building and redistributing the image.
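
For illustration, the three stages described above could be sketched roughly as follows. This is not the Dockerfile from debezium/debezium-examples#42, just a minimal sketch under stated assumptions: the builder images (eclipse-temurin, maven), the module list, the directory layout and the JAR name are all placeholders, and the JAR is assumed to be runnable on its own.

```dockerfile
# Stage 1: assemble a trimmed Java runtime with jlink.
# Assumption: the app only needs these modules; adjust the list as required.
FROM eclipse-temurin:17-jdk AS jdk
RUN "$JAVA_HOME/bin/jlink" \
      --add-modules java.base,java.logging \
      --no-man-pages --no-header-files \
      --output /opt/jdk-minimal

# Stage 2: resolve dependencies first, then build the application.
# The dependency layer is only rebuilt when pom.xml changes.
FROM maven:3-eclipse-temurin-17 AS build
WORKDIR /build
COPY pom.xml .
RUN mvn -B dependency:go-offline
COPY src ./src
RUN mvn -B package -DskipTests

# Stage 3: small runtime image with just the trimmed JDK and the application JAR.
FROM registry.fedoraproject.org/fedora-minimal
COPY --from=jdk /opt/jdk-minimal /opt/jdk
# Hypothetical artifact name; the real one depends on the project's POM.
COPY --from=build /build/target/app.jar /opt/app/app.jar
ENTRYPOINT ["/opt/jdk/bin/java", "-jar", "/opt/app/app.jar"]
```

Because the JAR is copied last, a pure code change only invalidates the final thin layer, while the JDK and dependency layers come from the build cache. The Maven repository stays in the intermediate build stage and never reaches the final image.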

So my question is: can we achieve something comparable with S2I? If so, how? If not, could we make it happen? Let me know what you think, looking forward to hearing from you :)

@vorburger
Collaborator

@gunnarmorling I think you're raising a number of (interesting) points here! 😄 Re. modularity, #181 ?

@rhuss
Contributor

rhuss commented Nov 5, 2018

Yeah, multi-stage Docker builds are very helpful, but as you guess, they are not really suitable for the way S2I works. The same is true for JIB, which also uses a layered approach for resources, dependencies and project classes. This works even completely without a Docker daemon and has very good performance characteristics.

S2I's answer for reusing already downloaded dependencies is 'incremental' builds. This works by copying over some parts from a previous build into the current image. The S2I images here take care that the .m2 directory is reused in an incremental build. However, in early tests (maybe two years ago?) it turned out that copying over the Maven repo took nearly as long as downloading the deps afresh. Maybe it's worth revisiting incremental builds?

Finally, @nicolaferraro and @lburgazzoli are working on sophisticated caching of various builder images with deps included, which they use in Camel K for super fast build times for similar builds. I don't know the details in depth, but maybe we could leverage that here as well for a generic S2I approach?

Also, an interesting article is https://blog.sonatype.com/improving-build-time-of-java-builds-on-openshift where Jorge shows some other strategies to increase the build performance of Java S2I builds.

@gunnarmorling
Author

> Yeah, multi-stage Docker builds are very helpful, but as you guess, they are not really suitable for the way S2I works.

Admittedly that makes me wonder whether S2I isn't then getting in the way more than it helps ;) Multi-stage builds are super-useful, so if a tool is blocking us from using them, it's a bit of a pity.

> S2I's answer for reusing already downloaded dependencies is 'incremental' builds.

I had looked into that, but the problem is that incremental builds drastically increase the resulting image size (the Maven repo gets added). That's bothersome, esp. as the Maven repo is a build-only artifact which isn't required at application runtime. That's exactly the beauty of multi-stage builds: they allow us to keep apart the artifacts of the different lifecycle phases (build vs. runtime).

> Finally, @nicolaferraro and @lburgazzoli are working on sophisticated caching of various builder images with deps included

But what is there to work on, if multi-stage builds already provide a very practical solution? Or is it about integrating multi-stage builds into S2I?

> Also, an interesting article...

I had a quick look, but instead of throwing more tools (Nexus) at the issue I'm wondering whether we can't come up with something simpler.

In fact, I'm curious why I should even use S2I instead of that rather simple Dockerfile, which btw. I can also use for local testing in Docker, Compose etc. One reason surely is that the Docker strategy isn't available with OS Online, hence I'm so eager to find out whether we can adjust the S2I process to take advantage of all this.

Sorry if I sound a bit negative, it's just that I think there's so much potential to make things smoother here for folks out there, so let's do it :)

@rhuss
Contributor

rhuss commented Nov 5, 2018

> Admittedly that makes me wonder whether S2I isn't then getting in the way more than it helps ;) Multi-stage builds are super-useful, so if a tool is blocking us from using them, it's a bit of a pity.

Tbh, I'm not sure whether multi-stage builds are included as a kind of standard, so I'm not sure that other build systems using Dockerfiles as their definition format also support multi-stage builds. It would be interesting to check whether multi-stage builds are usable outside the Docker universe.

@rhuss
Contributor

rhuss commented Nov 5, 2018

> But what is there to work on, if multi-stage builds already provide a very practical solution? Or is it about integrating multi-stage builds into S2I?

As mentioned above, I suspect that multi-stage builds require a Docker daemon and can't be created with other OCI-compliant build systems. As the Kubernetes ecosystem is moving away from Docker, introducing a Docker-only feature would be a blocker IMO. Also, e.g. Minishift and the latest supported Docker daemon for OpenShift don't even support multi-stage builds yet.

@rhuss
Contributor

rhuss commented Nov 5, 2018

> In fact, I'm curious why I should even use S2I instead of that rather simple Dockerfile, which btw. I can also use for local testing in Docker, Compose etc. One reason surely is that the Docker strategy isn't available with OS Online, hence I'm so eager to find out whether we can adjust the S2I process to take advantage of all this.

If you want to use OpenShift ImageStreams and build with OpenShift you have to use S2I for now, but depending on the momentum of knative-build, there might soon be alternatives.

@rhuss
Contributor

rhuss commented Nov 5, 2018

> Sorry if I sound a bit negative, it's just that I think there's so much potential to make things smoother here for folks out there, so let's do it :)

No problem ;-) I don't think Docker multi-stage builds are technically a bad thing, it's just that I see Docker support in platforms like Kubernetes or OpenShift on the decline (and as mentioned, multi-stage support never even made it into the Docker daemon used by OpenShift, and I suspect it will arrive in OpenShift land).

@gunnarmorling
Author

Gasp, I wasn't aware that the Docker daemon in OS doesn't support Dockerfiles with multi-stage builds. Thanks for pointing it out, @rhuss.

Regarding multi-stage builds themselves, I don't think they are inherently bound to the Docker daemon; at least in theory you could apply the same pattern with other image builders, too. After all, you're just taking the output created by one build as input for another. In fact, the other day I learned about OpenShift's chained builds, which look like that. So this might actually be the answer; I'll try and see whether I can build a complete example with the Java S2I builder.

On knative-build, I'm monitoring that, too. Might indeed be a viable alternative some time soon.

@vorburger
Collaborator

@gunnarmorling @rhuss just FYI re this old discussion here, I learnt today from @siamaksade over in quarkusio/quarkus#304 about OpenShift chained builds. You may know about this already (I somehow missed this previously), but if you don't, have a look; it would let you do something quite like a Docker multi-stage build today.

As for things a little bit more in the future: I believe we are moving from the old version of Docker used in today's OpenShift to podman with buildah, which supports multi-stage builds in a Dockerfile; I've used this standalone already, and it will eventually find its way into OpenShift.

And Knative build has build steps. I may or may not be able to get more into that in the coming weeks.

@rhuss
Contributor

rhuss commented Mar 14, 2019

@vorburger yeah, I know the chained S2I builds (and it's even described in the "Image Builder" pattern in our "Kubernetes Patterns" book, see k8spatterns.io ;-)

You can find the example from the book here: https://github.com/k8spatterns/examples/tree/master/advanced/ImageBuilder/openshift

@vorburger
Collaborator

This issue was more of a conceptual discussion about something which this project can't really "deliver" or "fix".

https://quarkus.io/guides/openshift-s2i-guide documents an example of how to set up a chained build.

@gunnarmorling @rhuss let me therefore close this old issue, to clean up this project a little bit - hope OK for you.
