Publishing Docker image to Docker Hub #1178

Open
kwkbtr opened this issue Aug 1, 2019 · 20 comments

Comments

@kwkbtr

kwkbtr commented Aug 1, 2019

Hi, do you have any plans to publish a Docker image built with release/Dockerfile to Docker Hub?
There may be a problem, as mentioned in #1008 (comment):

> Sadly DockerHub (for automated builds that @bfirsh suggested above) does not seem to support build arguments, so it is non-trivial to automate builds for both variants of the image.

Still, having at least an image with the default build arguments (ARG WITH_TEXLIVE="yes") would be a great help for us.
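For reference, building both variants locally could look roughly like the sketch below. It assumes that "no" is the value that disables TeX Live (only the default "yes" is quoted in this thread), and the image names latexml:texlive and latexml:minimal are hypothetical. By default the script only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Sketch: build one image per WITH_TEXLIVE value from the repository root.
# Assumptions: "no" disables TeX Live; the image names are hypothetical.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

build_variant() {
    with_texlive="$1"   # value passed to ARG WITH_TEXLIVE
    image="$2"          # name:tag for the resulting image
    set -- docker build --build-arg "WITH_TEXLIVE=$with_texlive" \
        -t "$image" -f release/Dockerfile .
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_variant yes latexml:texlive
build_variant no latexml:minimal
```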

@dginev
Collaborator

dginev commented Aug 1, 2019

Hi @kwkbtr. Currently @tkw1536 manages latexml's dockerization, so he could provide more context and brainstorm useful new images.

Have you seen the ones already created by Tom at:
https://hub.docker.com/r/latexml/latexml-test-runtime/

which we use for Travis?

@kwkbtr
Author

kwkbtr commented Aug 1, 2019

Thank you for your suggestion.
I had noticed those images but did not look into them closely, since I was not sure they are suitable for regular use rather than only for testing.
I will give them a try.

@kwkbtr
Author

kwkbtr commented Aug 1, 2019

I had a look at latexml/latexml-test-runtime and noticed that the image tags do not include the LaTeXML version number.
It would be great if a specific LaTeXML version could be selected via the image tag.

@tkw1536
Contributor

tkw1536 commented Aug 1, 2019

I have considered publishing images to Docker Hub. However, using Docker Hub auto-builds is difficult, because the Dockerfile is in the release subfolder whereas the build context it needs is the repository root. This is also stated in the Dockerfile itself:

# This Dockerfile expects the root directory of LaTeXML as a build context. 
# To achieve this run the following command from the root directory:
#
# > docker build -t latexml -f release/Dockerfile .

I can imagine three solutions to this:

  • We push images manually, which might result in a lot of work
  • We have Travis CI build the images and automatically push them to Dockerhub, but that would require some work to set up properly
  • We move the Dockerfile to the root of the repository (I vaguely remember this was the plan at some point, but there were some objections by @brucemiller)
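The second option could be sketched roughly as follows. This is not an existing script from this repository: the image name latexml/latexml and the DOCKER_USERNAME / DOCKER_PASSWORD variables are hypothetical and would have to be configured as CI secrets. The script prints the commands by default; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Sketch of a CI deploy step that builds from the repository root and pushes.
# Assumptions: IMAGE, DOCKER_USERNAME and DOCKER_PASSWORD are hypothetical.
IMAGE="${IMAGE:-latexml/latexml}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

deploy() {
    tag="$1"
    # The build context is the repository root, as release/Dockerfile expects.
    run docker build -t "$IMAGE:$tag" -f release/Dockerfile .
    if [ "$DRY_RUN" != "1" ]; then
        # Read the password from stdin so it does not show up in the process list.
        printf '%s' "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin
    fi
    run docker push "$IMAGE:$tag"
}

deploy latest
```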

@dginev
Collaborator

dginev commented Aug 1, 2019

At a high level, I think we need a general approach similar to the mini-team of maintainers who manage the Debian and Fedora packages for latexml. I think Bruce only manages the macports route.

Having an up-to-date and functional collection of docker images strikes me as a similar maintenance burden. We would likely need a volunteer to at least prepare images for the named releases.

Also linking to the current hits for latexml on dockerhub, maybe we could recruit one of their authors as a volunteer, e.g. @physikerwelt ?

https://hub.docker.com/search?q=latexml&type=image

@tkw1536
Contributor

tkw1536 commented Aug 2, 2019

I'll happily volunteer as maintainer of the DockerHub images, if we can figure out:

  • which images we want (with / without TeX Live)
  • when we want them updated (Daily? Weekly? Monthly? On release?)

@dginev
Collaborator

dginev commented Aug 2, 2019

Awesome, thanks Tom!

Personally I see a point for having release-based docker images (e.g. we can make one for each of 0.8.2, 0.8.3, 0.8.4 and then continue at each release point), as well as a single image that tracks master -- which is the bit that would have to be done automatically through Travis. That setup should take care of all reasonable use cases. Curious to hear if that would work for @kwkbtr as well?

@bfirsh
Contributor

bfirsh commented Aug 2, 2019

👍 That's what we do for engrafo. Git tags turn into image tags for releases, and latest tracks master. We also push an image tagged with the commit SHA for every commit, for the hell of it.
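As an illustration only (this is not the engrafo script), the tagging scheme can be written as a small shell function. The function name image_tags and the stripping of a leading "v" from Git tags are assumptions.

```shell
#!/bin/sh
# Sketch: list the image tags to push for a given Git state.
# Assumptions: function name and the "v" prefix handling are hypothetical.
image_tags() {
    branch="$1"    # current branch name, empty if not on a branch
    git_tag="$2"   # Git tag of the commit, empty if untagged
    sha="$3"       # commit hash
    # Every commit gets an image tagged with its hash.
    echo "$sha"
    # A Git tag turns into an image tag for the release.
    if [ -n "$git_tag" ]; then
        echo "${git_tag#v}"
    fi
    # latest tracks master.
    if [ "$branch" = "master" ]; then
        echo "latest"
    fi
}

image_tags master "" abc123
```

Each printed line would then be used as the tag in a docker push of the freshly built image.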

It's built on Travis so we can speed up builds by pulling and using --cache-from. That might be unnecessary for LaTeXML, so building on Docker Hub would work fine if you don't care about build speed.
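A rough sketch of the --cache-from pattern, assuming the hypothetical image name latexml/latexml and the build command from release/Dockerfile. Whether the cache is actually reused depends on the Docker builder in use, so treat this as a starting point. Commands are printed by default; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Sketch: reuse layers of the previously published image as a build cache.
# Assumption: the image name latexml/latexml is hypothetical.
IMAGE="${IMAGE:-latexml/latexml}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

cached_build() {
    tag="$1"
    # Pull the previous image so its layers are available locally; a failed
    # pull (for example on the very first build) is not treated as an error.
    run docker pull "$IMAGE:latest" || true
    run docker build --cache-from "$IMAGE:latest" \
        -t "$IMAGE:$tag" -f release/Dockerfile .
}

cached_build latest
```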

https://github.com/arxiv-vanity/engrafo/blob/master/.travis.yml
https://github.com/arxiv-vanity/engrafo/blob/master/script/ci-deploy-master
https://github.com/arxiv-vanity/engrafo/blob/master/script/ci-deploy-tag

@kwkbtr
Author

kwkbtr commented Aug 2, 2019

> That setup should take care of all reasonable use cases. Curious to hear if that would work for @kwkbtr as well?

Yes, that should work great for my current use case.
Thank you all for your consideration! 👍

@dginev
Collaborator

dginev commented Aug 2, 2019

Thanks @bfirsh, that's quite helpful!

@tkw1536
Contributor

tkw1536 commented Aug 4, 2019

I've made a PR that adds support for DockerHub auto builds: #1181

@dginev
Collaborator

dginev commented Feb 4, 2021

In the absence of an official resolution for maintaining a docker image (I think it is not on anyone's critical path?), I ended up sidelining this issue and creating a new Dockerfile for a multi-threaded harness project that converts large collections of mathematical formulas -- which is a typical use of latexml for the Math Information Retrieval community (e.g. ARQMath is using latexml in 2020-2021).

"sidelining" in the sense that I couldn't do a

FROM latexml:latest

to base my image on. So instead I based it on the latest rust image (the proglang for the harness), and did the entire latexml installation dance through apt and cpanminus. Linking the Dockerfile here for reference, note that this is still experimental:
https://github.com/dginev/latexml-runner/blob/main/Dockerfile

Would be nice to circle back and tidy up the Docker toolchain pieces... so, bump ?

@tkw1536
Contributor

tkw1536 commented Feb 4, 2021

From my end the Dockerfile in this repository still works. The only thing outstanding is that it should be published on some registry (e.g. DockerHub, GitHub Package Registry).

@dginev
Collaborator

dginev commented Aug 9, 2021

I'm bumping the milestone again, since it's hard for us to get into the right mindset to organize and actively maintain these. It's a bit of a paradox that while everyone wants to have an official and properly updating "dockerized latexml" available, no one has the right motivation to actually execute on that.

The latexml dockerhub namespace lacks people who actively use a dockerized vanilla latexml, so it's almost like we're squatting on that namespace handle. Tom has been great in updating the CI images regularly, but he doesn't do actual latexml-at-scale conversions, so it's a different focus. Meanwhile, I do latexml-at-scale conversions, but with my own home-baked docker image that does a lot more than a vanilla latexml image would. So the whole thing is a bit sideways... We ought to straighten it out.

@dginev dginev modified the milestones: LaTeXML-0.8.7, LaTeXML-0.8.8 Jul 29, 2022
@dginev
Collaborator

dginev commented Aug 2, 2022

While/Since we still don't have a resolution on how to maintain an official latexml image, I have published another unofficial one today, again installing latexml from scratch (in one of the many possible ways, this time using cpanminus, following the LaTeXML-Plugin-Cortex Dockerfile).

It is available under latexml/ar5ivist on Dockerhub, and the respective repository here. As the name suggests, it is a turnkey one-liner for conversions using the exact configuration for ar5iv.

@tkw1536
Contributor

tkw1536 commented Aug 5, 2022

I think we should use this to restart the discussion of having an official docker image or not.

@castedo

castedo commented Mar 20, 2024

@dginev FYI, I'm experimenting with an automatically built and publicly available OCI (docker) image with latexml over on gitlab. I'm planning to put it over at https://gitlab.com/perm.pub/dock. Feel free to shoot me questions and requests for better documentation, how I build it and why, etc... Enjoy!

@dginev
Collaborator

dginev commented Mar 20, 2024

@castedo thank you for the heads up!

You are most welcome to edit your comment above and describe the full details of your use case, both in executing latexml, and in the way you've decided to package and publish that setup. I think it can be informative for everyone tracking this issue to know of such recent developments.

@castedo

castedo commented Mar 21, 2024

Here's the dual Git & OCI container image registry which currently has LaTeXML 0.8.8, with some documentation on how to run it:
https://gitlab.com/perm.pub/dock/latexml-deb

For more details and documentation on how the container image gets automatically built and deployed, check out:
https://gitlab.com/perm.pub/dock/

I'm using it to investigate what kind of JATS XML gets output by latexml+latexmlpost.

@physikerwelt
Contributor

I have been using this docker file in production for several years. https://hub.docker.com/r/physikerwelt/latexml/tags It is a bit memory-hungry unless you restrict it.
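One way to restrict memory is a hard cap on the container, sketched below. The 2g value is an arbitrary placeholder rather than a measured requirement, and the helper name run_limited_cmd is hypothetical. The function only prints the command so that it can be reviewed first.

```shell
#!/bin/sh
# Sketch: print a docker run command with a hard memory cap.
# Assumptions: the 2g default and the helper name are placeholders.
run_limited_cmd() {
    image="$1"
    limit="${2:-2g}"
    # Setting --memory-swap equal to --memory leaves the container no swap
    # beyond the cap.
    echo "docker run --rm --memory $limit --memory-swap $limit $image"
}

run_limited_cmd physikerwelt/latexml
```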
