[Feature Request]: Execute code independently of the IDE #216

RHRolun · 2023-10-19T07:51:49Z

Feature description

The current workbenches execute user code inside the same environment that the IDE runs in.
This can be undesirable in some cases, as the dependencies needed to run the IDE may collide with the dependencies the code needs, or produce results that differ from those of a slimmer production environment.
The goal of this feature is to separate the IDE environment from the code execution environment, so that their dependencies do not get mixed and the code execution environment can easily be replaced by another with different dependencies.

Describe alternatives you've considered

Separate the IDE environment from the code execution environment, either through virtual environments (set default kernel in the notebooks) or through remote execution on a different pod.
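
A minimal sketch of the virtual-environment alternative, assuming a standard Jupyter setup in which each kernel is described by a kernel.json kernelspec and ipykernel is installed into the new environment separately. The function names, environment names and paths are hypothetical and not part of any workbench image.

```python
import json
import sys
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a venv and return the path to its Python interpreter."""
    # with_pip=True so that ipykernel and the project's dependencies
    # can be installed into the new environment afterwards.
    venv.EnvBuilder(with_pip=True).create(env_dir)
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    return env_dir / bin_dir / exe


def write_kernelspec(python: Path, kernels_dir: Path, name: str) -> Path:
    """Write a kernelspec that launches ipykernel with the given interpreter."""
    spec_dir = kernels_dir / name
    spec_dir.mkdir(parents=True, exist_ok=True)
    spec = {
        # Jupyter substitutes {connection_file} when it starts the kernel.
        "argv": [str(python), "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": f"Python ({name})",
        "language": "python",
    }
    spec_file = spec_dir / "kernel.json"
    spec_file.write_text(json.dumps(spec, indent=2))
    return spec_file
```

With this approach the kernel process uses the interpreter named in argv, not the one Jupyter itself runs on. The kernels_dir would typically be a directory Jupyter already searches, for example ~/.local/share/jupyter/kernels on Linux.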

Anything else?

No response

shalberd · 2023-10-19T09:17:32Z

mmh, isn't that the idea behind pipelines and dedicated / separate runtime containers for every step of a pipeline?

@guimou I think you have experience in this, too, I noticed once how you talked about Jupyter env dependencies.

RHRolun · 2023-10-19T09:23:18Z

@shalberd - yes, pipelines let you do this in a nice way, but having to go through a pipeline all the time while prototyping is quite a hassle.
This brings up another good point: if you want to develop a script for a specific step of the pipeline with specific dependencies, it would be great to quickly swap out your kernel/execution env and run it in the IDE without having to execute the pipeline.
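
As a rough illustration of running a single pipeline step outside the IDE's own environment, the sketch below launches a script with a different interpreter in a separate process. It assumes an environment with the step's dependencies already exists; run_with_interpreter is a hypothetical helper, not an existing workbench feature.

```python
import subprocess
from pathlib import Path


def run_with_interpreter(
    python: str, script: Path, *args: str
) -> subprocess.CompletedProcess:
    """Run `script` in a separate process using the given interpreter."""
    return subprocess.run(
        [python, str(script), *args],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        check=False,          # let the caller inspect returncode
    )
```

From a notebook cell this could be called with the interpreter of the step's environment and the step script, then result.stdout and result.returncode inspected. The script only sees the packages installed for that interpreter, not the ones Jupyter depends on.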

andrewballantyne · 2023-10-19T12:57:44Z

cc @harshad16

lucferbux · 2023-10-23T13:06:06Z

/transfer kubeflow

guimou · 2023-10-23T13:44:49Z

This has always been the issue with the way our workbench images are built. Several aspects to that:

UBI images are built with an already existing Python venv (/opt/app-root). Everything Python that happens will be in this venv. The rationale is that it prevents app/user packages from colliding with the ones built into the OS (there are some, for DNF and such). While the intent is good, it prevents creating and using other venvs.
Jupyter is a Python app, so it needs to run from somewhere... In local development mode, you will have as many Jupyter deployments as you have venvs: you switch to a specific venv (manually, with Anaconda, whatever...), and only then launch Jupyter. There you can manage some consistency and compatibility. This is not doable in our containerized environment as Jupyter IS the UI.
If you modify currently loaded packages, then what happens to Jupyter? It's a chicken-and-egg problem that I have never fully investigated.
Working in a single fixed venv has advantages though, the first one being immutability/consistency. If you let people create multiple ones, you're back to square one in terms of being able to share notebooks and data, as different people will have different venvs, more or less properly maintained or in sync.
Now, most if not all of our compatibility issues don't come from Jupyter itself, but from our extensions (Elyra, KFP,...), which either lag badly in terms of dependencies or have very strict pinned dependencies that prevent installing anything else alongside. I'm really close to simply ditch Elyra (or even Codeflare, which had the same kinds of issues until recently) out of my custom images, as it's a nightmare to get it working with some recent Python libraries...
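
To make the points above concrete, here is a small sketch using only the standard library that reports which environment the current kernel runs in and which dependencies an installed package pins exactly. The helper names are hypothetical, and the "==" check is a simplification of the full version-specifier grammar.

```python
import sys
from importlib import metadata


def describe_interpreter() -> dict:
    """Report where the current interpreter lives and whether it is a venv."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
    }


def is_pinned(requirement: str) -> bool:
    """True if the requirement string contains an '==' version pin."""
    # Drop environment markers such as '; python_version == "3.9"' first,
    # so that only the version specifier itself is checked.
    spec = requirement.split(";", 1)[0]
    return "==" in spec


def pinned_requirements(dist_name: str) -> list[str]:
    """List the pinned requirements declared by an installed distribution."""
    try:
        requires = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []
    return [req for req in requires if is_pinned(req)]
```

Run in a notebook cell, describe_interpreter() shows whether the kernel shares the venv Jupyter runs from, and pinned_requirements("elyra") would list the exact pins declared by Elyra if that distribution is installed.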

Some possible paths from there:

Switch to VSCode for specific jobs... As it's not a Python-based UI, you have fewer constraints in terms of compatibility, while still being able to work with notebooks. However, you lose Elyra...
Investigate how to manually create and persist other kernels in a persistent volume. People would be able to create and populate those kernels with what they want, and Jupyter would execute them as "external" stuff, meaning not using the same Python installation as the one it's currently running on.
Have some kind of custom selector at the beginning of a session, so something before/in front of Jupyter that would allow selecting a specific venv to run on. Somewhat similar to the Anaconda approach. However it's not that different from having different workbenches.
Update Elyra/KFP/Codeflare (and surely other extensions) to make sure they keep up with the rest of the world and don't introduce incompatibilities.
Don't include Elyra/KFP/Codeflare in all images, and have specific workbenches when you want to use those features. Definitely not ideal...
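
For the persistent-volume option, a minimal sketch of how persisted kernels could be inspected. It assumes the volume holds a Jupyter data directory (a folder with a kernels subdirectory, which Jupyter searches when that folder is listed in the JUPYTER_PATH environment variable). The directory layout and function names are hypothetical.

```python
import json
import sys
from pathlib import Path


def list_kernels(data_dir: Path) -> dict[str, str]:
    """Map kernel name to its interpreter (argv[0]) for data_dir/kernels."""
    kernels = {}
    for spec_file in sorted((data_dir / "kernels").glob("*/kernel.json")):
        spec = json.loads(spec_file.read_text())
        argv = spec.get("argv") or [""]
        kernels[spec_file.parent.name] = argv[0]
    return kernels


def external_kernels(data_dir: Path) -> dict[str, str]:
    """Kernels whose interpreter differs from the one running this process."""
    return {
        name: python
        for name, python in list_kernels(data_dir).items()
        if python != sys.executable
    }
```

Kernels reported by external_kernels run on a Python installation other than the one executing the check, which is the separation described in the second option above.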

shalberd · 2023-10-23T20:46:40Z

Working in a single fixed venv has advantages though, the first one being immutability/consistency.

That is THE reason we in our corporation always aim to have one container image mean one specific env, with clear dependencies. When developers want flexibility, we just build them another image, for which, by the way, @guimou made a great modular folder structure and toolset (interactive-image-builder.sh) that makes the whole thing a breeze: with an IDE, or without an IDE just for runtimes, e.g. in Airflow or Kubeflow pipelines, and so on.

I have worked with Anaconda and other toolchains as well, so I know both perspectives. Our data scientists, used to working locally on their laptops, initially gave us exactly the point of view mentioned here, but there are clear advantages to doing it the immutable / always-consistent per-container way.

I'm really close to simply ditch Elyra

Elyra is having issues with its community, I believe, and it is trying to be too many things all at once, for all kinds of deployments, container PaaSs and so on.

RHRolun added the kind/enhancement (New feature or request) and priority/normal (An issue with the product; fix when possible) labels Oct 19, 2023

openshift-ai-project-manager bot added this to Internal tracking Oct 23, 2023

openshift-ci bot transferred this issue from opendatahub-io/odh-dashboard Oct 23, 2023

lucferbux moved this from Untriaged to Done in ODH Dashboard Planning Oct 23, 2023

harshad16 added this to ODH IDE Planning Oct 24, 2023

github-project-automation bot moved this to 📋 Backlog in ODH IDE Planning Oct 24, 2023
