File Dependencies relative path under pipeline file properties #1706

Closed
morningcloud opened this issue May 27, 2021 · 11 comments
Labels: component:pipeline-editor, component:pipeline-runtime, feedback:user

Comments

@morningcloud

Describe the issue
My project has the following folder structure:

/servicelib/controllers/
/servicelib/models/
/pipelines/main/
/pipelines/experiment/

I have notebooks and related pipelines under /pipelines/main/ and some Python libraries used by the notebooks under /servicelib/. For this reason I added a relative path to the parent folder, ../../servicelib/, under File Dependencies in the notebook's properties in the pipeline. The pipeline runs successfully in place locally, but running it with the Kubeflow Pipelines runtime produces the following error:

Traceback (most recent call last):
  File "/src/conda/envs/notebook/lib/python3.7/site-packages/tornado/web.py", line 1704, in _execute
    result = await result
  File "/src/conda/envs/notebook/lib/python3.7/site-packages/elyra/pipeline/handlers.py", line 89, in post
    response = await PipelineProcessorManager.instance().process(pipeline)
  File "/src/conda/envs/notebook/lib/python3.7/site-packages/elyra/pipeline/processor.py", line 70, in process
    res = await asyncio.get_event_loop().run_in_executor(None, processor.process, pipeline)
  File "/src/conda/envs/notebook/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/src/conda/envs/notebook/lib/python3.7/site-packages/elyra/pipeline/processor_kfp.py", line 69, in process
    format(pipeline_name, pipeline_path), str(ex)) from ex
RuntimeError: ('Error compiling pipeline reftest-0527064628 at /tmp/tmpavdhj1ol/reftest-0527064628.tar.gz', "Node 'reftest' referenced dependencies that were not found: {'../../servicelib/controllers/*.py', '../../servicelib/services/*.py', '../../servicelib/models/*.py'}")

Screenshot of the properties window: [image not included]

Expected behavior
I expected the pipeline to run successfully, with the File Dependencies copied from the relative parent path.

Pipeline runtime environment
If the issue is related to pipeline execution, identify the environment where the pipeline is executed

  • Kubeflow Pipelines
@ptitzler
Member

I confirmed the behavior but believe that the Elyra packaging process does not support the kind of directory layout you are using. Can you please clarify how the notebooks are referencing the Python scripts?

@ptitzler
Member

To illustrate the directory problem, assume the following directory layout:

mypipeline.pipeline
mynotebook.ipynb
sub-dir/my-python-dependency.py

The input artifacts archive that is produced contains the following (/ denotes the root directory of the archive, not the root directory of the file system!):

/mynotebook.ipynb
/sub-dir/my-python-dependency.py

In this context, ../ in mynotebook.ipynb would therefore refer to something that conceptually does not exist when the notebook is executed in KFP/AA.
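
The constraint described above can be sketched in a few lines of Python. This is only an illustration and not Elyra's packaging code: `archive_member_name` is a hypothetical helper, and the assumption is that every dependency is stored in the archive under its path relative to the notebook's directory.

```python
import posixpath
from typing import Optional


def archive_member_name(dependency: str) -> Optional[str]:
    """Map a dependency path (relative to the notebook) to its archive path.

    Hypothetical helper: returns None when the path would land outside the
    archive root, which is the situation that "../" creates.
    """
    normalized = posixpath.normpath(dependency)
    escapes_root = normalized == ".." or normalized.startswith("../")
    if escapes_root or posixpath.isabs(normalized):
        return None
    return "/" + normalized


if __name__ == "__main__":
    for dep in ("sub-dir/my-python-dependency.py", "../../servicelib/models/*.py"):
        print(dep, "->", archive_member_name(dep))
```

The first path maps to `/sub-dir/my-python-dependency.py`, while the `../../servicelib/...` pattern has no location inside the archive, so the helper returns `None` for it.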

@ptitzler
Member

A potential workaround, although it's not ideal and I haven't tested it yet, would be to define symlinks in the directory where the pipeline file/notebook is located. Support for symlinks was only recently added by #1689, which is not yet included in any stable release.

@ptitzler
Member

ptitzler commented May 27, 2021

Confirmed that with #1689 in place the following would work:

/servicelib/controllers/
/servicelib/controllers/a-file.py
...
/pipelines/main/mypipeline.pipeline
/pipelines/main/mynotebook.ipynb   << references symlink instead of symlink source
/pipelines/main/controllers        << directory symlink for ../../servicelib/controllers/
/pipelines/main/a-link.py          << file symlink to ../../servicelib/controllers/a-file.py
...
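
One way to create such links is sketched below, assuming a POSIX-like system where creating symlinks is permitted. `link_into_notebook_dir` is a hypothetical helper written for this example and is not part of Elyra; running `ln -s` by hand gives the same result.

```python
import os
import tempfile
from pathlib import Path


def link_into_notebook_dir(notebook_dir: Path, source: Path, link_name: str) -> Path:
    """Create a relative symlink to `source` inside `notebook_dir`.

    Hypothetical helper: the link target is expressed relative to the
    notebook directory, e.g. ../../servicelib/controllers.
    """
    link = notebook_dir / link_name
    target = os.path.relpath(source, start=notebook_dir)
    if not link.is_symlink():
        os.symlink(target, link, target_is_directory=source.is_dir())
    return link


if __name__ == "__main__":
    # Rebuild the layout from the comment above in a scratch directory.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        controllers = root / "servicelib" / "controllers"
        main = root / "pipelines" / "main"
        controllers.mkdir(parents=True)
        main.mkdir(parents=True)
        (controllers / "a-file.py").write_text("x = 1\n")

        dir_link = link_into_notebook_dir(main, controllers, "controllers")
        file_link = link_into_notebook_dir(main, controllers / "a-file.py", "a-link.py")
        print(dir_link.name, "->", os.readlink(dir_link))
        print(file_link.name, "->", os.readlink(file_link))
```

With the links in place, the File Dependencies would reference `controllers/*.py` or `a-link.py` rather than any `../` path.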

@morningcloud
Author

Adding symlinks does look like a logical workaround that avoids having to change the folder structure so that the pipeline files sit in the root directory. We put all pipeline files in a separate subdirectory for organisation's sake.
Indeed, it would be good to have documented best practices on how to structure development projects.

@lresende
Member

And just to clarify things, the actual pipeline files can be anywhere. The requirement applies to the notebook: its dependencies need to be located in the notebook's own directory or in subdirectories of it. Otherwise, we can't reconstruct the folder hierarchy at the root of the container during runtime execution.

@ptitzler
Member

To summarize:

Please let me know if I missed anything.

@morningcloud
Author

Thanks for the summary. This covers everything 👍🏻

@pacospace
Contributor

Thanks @ptitzler, what is the ETA for the stable 2.3 release?

@lresende
Member

@pacospace 2.3 will now be called 3.0 and you can see the current status at the release milestone but my feeling is that we still need about a couple of weeks to get it stabilized and released.

@pacospace
Contributor

Thank you very much for the update @lresende! I will keep an eye on that! 🚀
