Option to launch a binder directly from a dockerhub image (bypass repo2docker completely) #1298

Open
Tracked by #46
rabernat opened this issue May 14, 2021 · 13 comments

Comments

@rabernat

Proposed change

In Pangeo, we use CI to build complex docker images with our full stack in https://github.com/pangeo-data/pangeo-docker-images. These images are used directly in various Pangeo JupyterHubs.

We also want to use the same images in binder. We nearly always use the nbgitpuller trick to keep the binder env and the contents in separate repos. Currently this requires making a "passthrough" repo with a single-line Dockerfile pointing at the desired image on Dockerhub, e.g.: https://github.com/pangeo-gallery/default-binder/blob/master/binder/Dockerfile

Maintaining this "passthrough" repo is an extra step that leads to unnecessary complexity and also wastes binder resources rebuilding docker containers that are unchanged from the dockerhub version.

I would love to have an option to launch a binder directly from a dockerhub (or other container registry) image, completely bypassing repo2docker.

Alternative options

Just keep doing what we are doing now, which works fine but requires additional complexity.

Who would use this feature?

Pangeo Gallery and the entire Pangeo project would use this feature heavily. More generally, this feature would help bridge the gap between cloud-based JupyterHubs using pre-built docker images and Binders, improving interoperability between environments. It would make it trivial to launch a binder with an identical environment to a cloud-based JupyterHub, without requiring users to mess around with Dockerfiles.

(Optional): Suggest a solution

I don't know enough about how binderhub works to propose an implementation.

@welcome

welcome bot commented May 14, 2021

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@yuvipanda
Collaborator

We probably need to add some code here to check if the image exists in the registry and mark it as found if so.
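
As a rough illustration of the check described above (not BinderHub's actual code), the sketch below asks a registry whether an image manifest exists, using the manifest endpoint of the Registry HTTP API v2. The function names and the injected `head` callable are assumptions made for this example. Many registries, Docker Hub included, require a bearer token even for public images, so the real request and its authentication are left to the caller.

```python
from typing import Callable


def manifest_url(registry_host: str, repository: str, reference: str) -> str:
    """Build the Registry HTTP API v2 manifest URL for an image.

    `reference` is a tag (e.g. "latest") or a digest (e.g. "sha256:...").
    """
    return f"https://{registry_host}/v2/{repository}/manifests/{reference}"


def image_exists(
    registry_host: str,
    repository: str,
    reference: str,
    head: Callable[[str], int],
) -> bool:
    """Return True if the registry reports the manifest as present.

    `head` is a hypothetical callable supplied by the caller. It should send
    a HEAD request to the given URL (with whatever authentication the
    registry needs) and return the HTTP status code.
    """
    status = head(manifest_url(registry_host, repository, reference))
    if status == 200:
        return True
    if status == 404:
        return False
    # Anything else (401, 429, 5xx, ...) is not a clear yes or no.
    raise RuntimeError(f"unexpected registry response: {status}")


if __name__ == "__main__":
    # Stubbed transport so the example runs without network access.
    def fake_head(url: str) -> int:
        return 200 if url.endswith("/latest") else 404

    print(image_exists("registry.example.org", "docker-org/image", "latest", fake_head))
    print(image_exists("registry.example.org", "docker-org/image", "missing", fake_head))
```

Injecting the transport keeps the existence logic testable and leaves token handling, which differs between registries, outside the sketch.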

@scottyhq

I brought this up in a separate thread, jupyterhub/mybinder.org-deploy#1474 (comment). I probably should have opened a Discourse forum post but never got around to it. In any case, I wanted to link to some relevant discussion from the past.

@betatim
Member

betatim commented May 17, 2021

What about having something like binderhub.example.org/v2/dockerhub/docker-org/image/tag as launch URLs for this kind of thing?
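
To make the proposed URL shape concrete, here is a minimal sketch of how such a launch path could be split into an image reference. It assumes exactly the five-segment layout in the example URL above. `ImageSpec`, `parse_launch_path` and `image_reference` are hypothetical names for this illustration, not existing BinderHub code.

```python
from typing import NamedTuple


class ImageSpec(NamedTuple):
    provider: str  # e.g. "dockerhub"
    image: str     # e.g. "docker-org/image"
    tag: str       # e.g. "tag"


def parse_launch_path(path: str) -> ImageSpec:
    """Split '/v2/<provider>/<org>/<image>/<tag>' into its parts."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != 5 or parts[0] != "v2":
        raise ValueError(f"unsupported launch path: {path!r}")
    _, provider, org, image, tag = parts
    return ImageSpec(provider, f"{org}/{image}", tag)


def image_reference(spec: ImageSpec) -> str:
    """Render the 'name:tag' form that `docker pull` accepts."""
    return f"{spec.image}:{spec.tag}"


if __name__ == "__main__":
    spec = parse_launch_path("/v2/dockerhub/docker-org/image/tag")
    print(spec)
    print(image_reference(spec))
```

A real provider would also need to handle images without an organisation prefix and registries other than Docker Hub, which this fixed layout does not cover.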

@manics
Member

manics commented May 17, 2021

> What about having something like binderhub.example.org/v2/dockerhub/docker-org/image/tag as launch URLs for this kind of thing?

Conceptually this sounds like a new content-provider, especially as a tag can be the equivalent of a git branch that's updated, so you'd need to decide whether or not to check for an updated image. For efficiency it'd be nice to bypass repo2docker, but as a proof-of-concept having repo2docker pull and push the image might be feasible?

Do you think it should be hardcoded to dockerhub, or something like docker with support for all public docker registries?

@yuvipanda
Collaborator

> Do you think it should be hardcoded to dockerhub, or something like docker with support for all public docker registries?

Definitely support most public registries, I think. It should have the same features as passing the image to docker pull.

> so you'd need to decide whether or not to check for an updated image.

Where would we store this information? I was instead thinking we'll pass it through to kubernetes and set imagePullPolicy to Always - which will do the 'right thing' I think.
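
For illustration, the sketch below builds a plain Pod manifest as a Python dict with `imagePullPolicy` set to `Always`, which tells the kubelet to check the registry for the image every time the container starts instead of trusting a locally cached tag. The helper names and the container name are assumptions for this example. In a real deployment the pod is created by KubeSpawner, which exposes an `image_pull_policy` setting for the same purpose.

```python
def container_spec(image: str, name: str = "notebook") -> dict:
    """Container entry that always re-checks the registry for `image`."""
    return {
        "name": name,
        "image": image,
        # With "Always", a moved tag (e.g. "latest") is picked up on the
        # next launch; unchanged layers are still served from the node cache.
        "imagePullPolicy": "Always",
    }


def pod_manifest(image: str, pod_name: str) -> dict:
    """Minimal Pod manifest wrapping a single container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {"containers": [container_spec(image)]},
    }


if __name__ == "__main__":
    import json

    print(json.dumps(pod_manifest("docker-org/image:tag", "binder-demo"), indent=2))
```

This avoids storing any "last checked" state on the BinderHub side, since the freshness check happens on each pod start.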

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/embed-binder-related-metadata-in-notebook/10329/1

@meeseeksmachine

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/use-published-docker-image-for-binder/10333/3

@cisaacstern

Just a quick 👍 for this feature. In the Pangeo Forge sandbox we define a custom image for Binder, to provide new users a pre-built environment to experiment with our tools. IIUC, if we could pull pre-built images from Docker Hub, it's reasonable to expect that these environments would load faster (thereby reducing friction for new users). Thanks to everyone working on this feature!

@betolink

This would be super useful to have, any status update on this issue?

@yuvipanda
Collaborator

@betolink nobody has had any time to work on this yet :(

@ctr26
Contributor

ctr26 commented Oct 28, 2022

This would be really useful for me too

@manics
Member

manics commented Oct 30, 2022

A quick way of implementing this would be to ignore BinderHub, and customise KubeSpawner to show a form where a user can enter the required container registry image (and/or a dropdown of existing images, perhaps dynamically generated). If you're following the nbgitpuller trick you could even include the contents URL as a form parameter.
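
A minimal sketch of that approach, under some assumptions: the functions below are written as plain Python so they can be tried without JupyterHub installed, and the commented lines at the end show how they might be wired into `jupyterhub_config.py` through the spawner's `options_form`, `options_from_form` and `pre_spawn_hook` settings. The image names in `KNOWN_IMAGES` and the `content_url` field are made up for illustration, and the exact callable signatures can differ between JupyterHub versions.

```python
import html

# Hypothetical images offered as suggestions; users may type any other image.
KNOWN_IMAGES = ["example-org/notebook:latest", "example-org/ml-notebook:latest"]


def options_form(spawner=None) -> str:
    """HTML fragment shown on the spawn page."""
    choices = "".join(
        f'<option value="{html.escape(img)}"></option>' for img in KNOWN_IMAGES
    )
    return (
        '<label for="image">Container image</label>'
        '<input id="image" name="image" list="known-images" required>'
        f'<datalist id="known-images">{choices}</datalist>'
        '<label for="content_url">Contents URL (optional)</label>'
        '<input id="content_url" name="content_url">'
    )


def options_from_form(formdata: dict) -> dict:
    """Convert submitted form data (a dict of lists) into user options."""
    image = formdata.get("image", [""])[0].strip()
    if not image:
        raise ValueError("an image reference is required")
    return {
        "image": image,
        "content_url": formdata.get("content_url", [""])[0].strip(),
    }


def pre_spawn_hook(spawner) -> None:
    """Apply the chosen image to the spawner before the pod is created."""
    spawner.image = spawner.user_options["image"]


# Assumed wiring in jupyterhub_config.py:
# c.KubeSpawner.options_form = options_form
# c.KubeSpawner.options_from_form = options_from_form
# c.KubeSpawner.pre_spawn_hook = pre_spawn_hook
```

Letting users launch arbitrary images has security and resource implications, so a deployment would likely want to validate the submitted image against an allow-list or a trusted registry prefix.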
