Use case: Allow for jupyter or interactive notebooks via jobset #133

Closed
kannon92 opened this issue May 9, 2023 · 14 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@kannon92
Contributor

kannon92 commented May 9, 2023

Interactive jobs are a very common use case in HPC and in DS/ML.

I wanted to play around with JobSet, and I have the following example, which creates a Jupyter pod and a service. However, maybe I don't understand headless services, but I am unable to access this service via a port-forward.

My JobSet:

apiVersion: jobset.x-k8s.io/v1alpha1
kind: JobSet
metadata:
  name: jupyter
spec:
  suspend: false
  replicatedJobs:
  - name: rj
    network:
      enableDNSHostnames: true
    template:
      spec:
        backoffLimit: 0
        template:
          spec:
            containers:
            - name: jupyterlab
              imagePullPolicy: IfNotPresent
              image: jupyter/tensorflow-notebook:latest
              securityContext:
                runAsUser: 1000
              resources:
                limits:
                  memory: 1Gi
                  cpu: 1
                requests:
                  memory: 1Gi
                  cpu: 1
              ports:
                - containerPort: 8888
                  name: jupyterlab
              env:
              - name: JUPYTER_TOKEN
                value: testing

I can create an example of this use case manually with a Job and a Service. Is it in scope for JobSet to support something like this?

apiVersion: batch/v1
kind: Job
metadata:
 name: jupyter-job
spec:
 backoffLimit: 0
 template:
   metadata:
     labels:
       app: jupyter
   spec:
     restartPolicy: Never
     containers:
     - name: jupyterlab
       imagePullPolicy: IfNotPresent
       image: jupyter/tensorflow-notebook:latest
       securityContext:
         runAsUser: 1000
       resources:
         limits:
           memory: 1Gi
           cpu: 1
         requests:
           memory: 1Gi
           cpu: 1
       ports:
       - containerPort: 8888
         name: jupyterlab
       env:
       - name: JUPYTER_TOKEN
         value: testing
---
apiVersion: v1
kind: Service
metadata:
 name: jupyter-service
spec:
 selector:
   app: jupyter
 ports:
   - protocol: TCP
     port: 8888
     targetPort: 8888

@kannon92
Contributor Author

@danielvegamyhre or @ahg-g any thoughts on this?

@vsoch
Contributor

vsoch commented Jun 4, 2023

@kannon92 I tried this out, and it works OK for me. Could it be something about your development environment? I was using kind on Ubuntu 22.04. Here is what I did: https://github.com/researchapps/jobset-jupyter

@vsoch
Contributor

vsoch commented Jun 4, 2023

But I don't think this uses anything with the headless service? I just turned it off, and the notebook still forwards. I think it would mostly be important if different things within a jobset (e.g., more than one pod) needed to communicate. For a one off pod serving something on a single port, we don't need JobSet / the extra DNS.
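
As a minimal sketch of turning the option off, the fragment below reuses the replicatedJobs entry from the first comment and only flips enableDNSHostnames to false. It is a partial manifest, not something to apply as-is. Whether omitting the network block entirely has the same effect depends on the defaults of the JobSet version in use, which this thread does not state.

```yaml
# Fragment of the JobSet from the first comment (not a complete manifest).
# Only the network setting is changed; the pod template stays as posted.
spec:
  replicatedJobs:
  - name: rj
    network:
      enableDNSHostnames: false   # no per-pod DNS hostnames requested
    template:
      spec:
        backoffLimit: 0
        # pod template (containers, ports, env) unchanged from the original
```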

@kannon92
Contributor Author

kannon92 commented Jun 5, 2023

Thanks @vsoch! I didn’t realize I could use the pod. 🙃

So in our use case, we would eventually want to use an Ingress without requiring users to port-forward to a pod. Is it possible to use a headless service with an Ingress?

@vsoch
Contributor

vsoch commented Jun 5, 2023

I'm not sure - I haven't used ingress much because the setup is quite extensive. Do you want to try it?

@ahg-g
Contributor

ahg-g commented Jun 9, 2023

"I wanted to play around with using jobsets and I have the following example which creates a jupyter pod and a service. However, maybe I don't understand headless services, but I am unable to access this service via a port-forward."

Sorry, I missed this.

The headless service that JobSet creates doesn't allocate a virtual IP. What it does is trigger creation of the DNS records for the pods so that they can be reached by their hostnames. If you notice, the API we have in JobSet is enableDNSHostnames; the headless service is just an implementation detail.
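
For readers unfamiliar with the term, here is a rough sketch of what a headless Service looks like in plain Kubernetes. It is not the object JobSet generates: the name and the selector label are hypothetical and chosen only for illustration. The defining line is clusterIP: None, which tells Kubernetes not to allocate a virtual IP, so DNS resolves to the pod addresses directly.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: jupyter            # hypothetical name, for illustration only
spec:
  clusterIP: None          # headless: no virtual IP is allocated
  selector:
    app: jupyter           # hypothetical label; must match the target pods
  ports:
  - port: 8888
```

A pod whose hostname is set and whose subdomain matches the Service name then becomes resolvable inside the cluster as hostname.subdomain.namespace.svc.cluster.local, which is the per-pod DNS behavior described above.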

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Jan 22, 2024

@vsoch
Contributor

vsoch commented Jan 22, 2024

@kannon92 do you want to try anything else? I'm happy to help (with slightly more experience than the first time we chatted)!

@ahg-g
Contributor

ahg-g commented Jan 22, 2024

I don't think there is anything to be done here on the JobSet side. It would be nice to have an example in the repo though.

@vsoch
Contributor

vsoch commented Jan 22, 2024

Okay, I updated my example to use a newer (and now pinned) version of JobSet, changed the port-forward to a proper Ingress, and added the appropriate configs. I'm good to close here if @kannon92 is. We don't have a CLA yet, so I can't contribute formally from my work, but I would be happy to if/when we get that!
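
The configs in the linked repository are not reproduced in this thread. As a hedged sketch of the general shape, the manifest below pairs a regular ClusterIP Service (the jupyter-service from the first comment) with an Ingress that routes to it. It assumes the notebook pods carry the label app: jupyter, that an ingress controller is installed in the cluster, and that its class is named nginx. The Ingress name is hypothetical.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: jupyter-service
spec:
  selector:
    app: jupyter             # assumption: the pod template sets this label
  ports:
  - protocol: TCP
    port: 8888
    targetPort: 8888
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jupyter-ingress      # hypothetical name
spec:
  ingressClassName: nginx    # assumption: an nginx ingress controller exists
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: jupyter-service
            port:
              number: 8888
```

With something like this in place, users reach the notebook through the ingress controller's address instead of port-forwarding to a pod.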

@kannon92
Contributor Author

This sounds good! Happy to leave it open as a reminder to you @vsoch for the example once you are able to contribute.

Or we can close this for now.

@vsoch
Contributor

vsoch commented Jan 22, 2024

Either works for me - I'm not able to give a good timetable for when the lab will have the CLA signed. It's been a hot minute so far. 😆

@vsoch
Contributor

vsoch commented Jan 22, 2024

Regardless of the issue I'll add this to my master TODO as a reminder for me.

@kannon92
Contributor Author

TODO sounds good then! Thanks for looking into this.
