Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add codeflare sdk to datascience base image #196

Closed

Conversation

dimakis
Copy link

@dimakis dimakis commented Sep 1, 2023

Closes #197

Description

The addition of the CodeFlare SDK to interact with codeflare stack

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@dimakis
Copy link
Author

dimakis commented Sep 1, 2023

@rkpattnaik780 hey bro! Can you have a look at this pr?

Copy link
Contributor

@rkpattnaik780 rkpattnaik780 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 4, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rkpattnaik780
Once this PR has been reviewed and has the lgtm label, please assign atheo89 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rkpattnaik780
Copy link
Contributor

/retest-required

@atheo89
Copy link
Member

atheo89 commented Sep 4, 2023

Hi @dimakis,

This PR looks good overall, but I have a couple of suggestions that are needed:

1. Consistency Across OS Flavors: It's important to maintain consistency among the different flavors of OS in our notebooks. To do that, please make sure to add the codeflare-sdk package to the ubi8 data science notebook as well.

2. Updating the rest Pipfiles: Additionally, don't forget to update the Pipfiles for the following notebooks: PyTorch, TensorFlow, and TrustyAI (As well as on the corresponding runtimes). You can refer to the example line for guidance on how to make these updates. This will ensure that the dependencies are correctly aligned across all relevant notebooks.

@dimakis
Copy link
Author

dimakis commented Sep 4, 2023

Hi @dimakis,

This PR looks good overall, but I have a couple of suggestions that are needed:

1. Consistency Across OS Flavors: It's important to maintain consistency among the different flavors of OS in our notebooks. To do that, please make sure to add the codeflare-sdk package to the ubi8 data science notebook as well.

2. Updating the rest Pipfiles: Additionally, don't forget to update the Pipfiles for the following notebooks: PyTorch, TensorFlow, and TrustyAI (As well as on the corresponding runtimes). You can refer to the example line for guidance on how to make these updates. This will ensure that the dependencies are correctly aligned across all relevant notebooks.

@atheo89, thank very much for your detailed reply. Due to a dependency on our end (which we hope to overcome soon) python 3.8 will not work with our stack, hence the inclusion in only 3.9.
I will update the others to include it though. Thanks again :)

@dimakis dimakis force-pushed the addition-of-codeflare-sdk branch from c79d237 to 3fcd893 Compare September 4, 2023 11:58
@openshift-ci openshift-ci bot removed the lgtm label Sep 4, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 4, 2023

New changes are detected. LGTM label has been removed.

@dimakis
Copy link
Author

dimakis commented Sep 4, 2023

@atheo89 I've managed to piplock env all but the tensorflow image with this error:

...
✘ Locking Failed!
⠹ Locking...
CRITICAL:pipenv.patched.pip._internal.resolution.resolvelib.factory:Could not find a version that satisfies the requirement tensorflow~=2.11.0 (from versions: 2.13.0rc0, 2.13.0rc1, 2.13.0rc2, 2.13.0, 2.14.0rc0, 2.14.0rc1)
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/resolver.py", line 811, in _main
[ResolutionFailure]:       resolve_packages(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/resolver.py", line 759, in resolve_packages
[ResolutionFailure]:       results, resolver = resolve(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/resolver.py", line 738, in resolve
[ResolutionFailure]:       return resolve_deps(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 1096, in resolve_deps
[ResolutionFailure]:       results, hashes, markers_lookup, resolver, skipped = actually_resolve_deps(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 895, in actually_resolve_deps
[ResolutionFailure]:       resolver.resolve()
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 684, in resolve
[ResolutionFailure]:       raise ResolutionFailure(message=str(e))
[pipenv.exceptions.ResolutionFailure]: Warning: Your dependencies could not be resolved. You likely have a mismatch in your sub-dependencies.
  You can use $ pipenv install --skip-lock to bypass this mechanism, then run $ pipenv graph to inspect the situation.
  Hint: try $ pipenv lock --pre if it is a pre-release dependency.
ERROR: No matching distribution found for tensorflow~=2.11.0

have you seen this before? do you know how I'd get around it? should I just run pipenv install --skip-lock? although that appears to not be what I need to do at all. the error message could be better from pip TBH

@atheo89
Copy link
Member

atheo89 commented Sep 4, 2023

@atheo89 I've managed to piplock env all but the tensorflow image with this error:

...
✘ Locking Failed!
⠹ Locking...
CRITICAL:pipenv.patched.pip._internal.resolution.resolvelib.factory:Could not find a version that satisfies the requirement tensorflow~=2.11.0 (from versions: 2.13.0rc0, 2.13.0rc1, 2.13.0rc2, 2.13.0, 2.14.0rc0, 2.14.0rc1)
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/resolver.py", line 811, in _main
[ResolutionFailure]:       resolve_packages(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/resolver.py", line 759, in resolve_packages
[ResolutionFailure]:       results, resolver = resolve(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/resolver.py", line 738, in resolve
[ResolutionFailure]:       return resolve_deps(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 1096, in resolve_deps
[ResolutionFailure]:       results, hashes, markers_lookup, resolver, skipped = actually_resolve_deps(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 895, in actually_resolve_deps
[ResolutionFailure]:       resolver.resolve()
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 684, in resolve
[ResolutionFailure]:       raise ResolutionFailure(message=str(e))
[pipenv.exceptions.ResolutionFailure]: Warning: Your dependencies could not be resolved. You likely have a mismatch in your sub-dependencies.
  You can use $ pipenv install --skip-lock to bypass this mechanism, then run $ pipenv graph to inspect the situation.
  Hint: try $ pipenv lock --pre if it is a pre-release dependency.
ERROR: No matching distribution found for tensorflow~=2.11.0

have you seen this before? do you know how I'd get around it? should I just run pipenv install --skip-lock? although that appears to not be what I need to do at all. the error message could be better from pip TBH

It appears there is an inconsistency involving the tensorflow~=2.11.0 requirement. You might want to consider utilizing an earlier version of the Codeflare SDK.
Try to run the make refresh-pipfilelock-files cmd on the same level with the makefile

@dimakis
Copy link
Author

dimakis commented Sep 4, 2023

@atheo89 I've managed to piplock env all but the tensorflow image with this error:

...
✘ Locking Failed!
⠹ Locking...
CRITICAL:pipenv.patched.pip._internal.resolution.resolvelib.factory:Could not find a version that satisfies the requirement tensorflow~=2.11.0 (from versions: 2.13.0rc0, 2.13.0rc1, 2.13.0rc2, 2.13.0, 2.14.0rc0, 2.14.0rc1)
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/resolver.py", line 811, in _main
[ResolutionFailure]:       resolve_packages(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/resolver.py", line 759, in resolve_packages
[ResolutionFailure]:       results, resolver = resolve(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/resolver.py", line 738, in resolve
[ResolutionFailure]:       return resolve_deps(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 1096, in resolve_deps
[ResolutionFailure]:       results, hashes, markers_lookup, resolver, skipped = actually_resolve_deps(
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 895, in actually_resolve_deps
[ResolutionFailure]:       resolver.resolve()
[ResolutionFailure]:   File "/opt/homebrew/lib/python3.11/site-packages/pipenv/utils/resolver.py", line 684, in resolve
[ResolutionFailure]:       raise ResolutionFailure(message=str(e))
[pipenv.exceptions.ResolutionFailure]: Warning: Your dependencies could not be resolved. You likely have a mismatch in your sub-dependencies.
  You can use $ pipenv install --skip-lock to bypass this mechanism, then run $ pipenv graph to inspect the situation.
  Hint: try $ pipenv lock --pre if it is a pre-release dependency.
ERROR: No matching distribution found for tensorflow~=2.11.0

have you seen this before? do you know how I'd get around it? should I just run pipenv install --skip-lock? although that appears to not be what I need to do at all. the error message could be better from pip TBH

It appears there is an inconsistency involving the tensorflow~=2.11.0 requirement. You might want to consider utilizing an earlier version of the Codeflare SDK. Try to run the make refresh-pipfilelock-files cmd on the same level with the makefile

Even running it in a fresh dir, with a fresh venv won't allow me to $ pip install tensorflow==2.11.0
I come away with this error:
ERROR: Could not find a version that satisfies the requirement tensorflow==2.11.0 (from versions: 2.13.0rc0, 2.13.0rc1, 2.13.0rc2, 2.13.0, 2.14.0rc0, 2.14.0rc1)
ERROR: No matching distribution found for tensorflow==2.11.0`

Can the tensflow version be bumped, the current release is 2.13? 🤔 This is surely going to be a re-occurring problem and not one that I alone will face?

@atheo89
Copy link
Member

atheo89 commented Sep 5, 2023

We will indeed be updating all the notebooks in the upcoming release. Since codeflare-sdk is a new addition will be scheduled for the 2023b release at the end of October, I recommend listing it under this specific issue. This will ensure that our team takes it into consideration during the upgrade process.

@harshad16
Copy link
Member

Thanks for the PR. 💯
Taking a look at the issues, pertaining to the CI.

  • Ack, on not including the package in the py 3.8 stacks.
    we are okay not to have it in there.

@atheo89
Copy link
Member

atheo89 commented Sep 8, 2023

Closing this PR as it will be addressed as part of the upcoming notebooks upgrade.
#205

@atheo89 atheo89 closed this Sep 8, 2023
@harshad16 harshad16 reopened this Sep 8, 2023
@harshad16
Copy link
Member

The codeflare would be also include in #205 and with these.
as we would like this in 2023.a as well.

@harshad16 harshad16 force-pushed the addition-of-codeflare-sdk branch from 3fcd893 to e22b8cb Compare September 25, 2023 14:51
@harshad16 harshad16 force-pushed the addition-of-codeflare-sdk branch from e22b8cb to c5333cb Compare September 25, 2023 15:57
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 26, 2023

@dimakis: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/jupyter-datascience-anaconda-python-3-8-pr-image-mirror 3fcd893 link true /test jupyter-datascience-anaconda-python-3-8-pr-image-mirror
ci/prow/images c5333cb link true /test images
ci/prow/notebooks-e2e-tests c5333cb link true /test notebooks-e2e-tests
ci/prow/notebook-habana-1-11-0-ubi8-python-3-8-pr-image-mirror c5333cb link true /test notebook-habana-1-11-0-ubi8-python-3-8-pr-image-mirror
ci/prow/notebook-habana-1-9-0-ubi8-python-3-8-pr-image-mirror c5333cb link true /test notebook-habana-1-9-0-ubi8-python-3-8-pr-image-mirror
ci/prow/notebook-habana-1-10-0-ubi8-python-3-8-pr-image-mirror c5333cb link true /test notebook-habana-1-10-0-ubi8-python-3-8-pr-image-mirror
ci/prow/habana-notebooks-e2e-tests c5333cb link true /test habana-notebooks-e2e-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@harshad16
Copy link
Member

codeflare-sdk is installing torch package with its installation.
Which we cant have in data-science package.

seems like code-flare-SDK requires: pytorch-lighting
https://github.com/project-codeflare/codeflare-sdk/blob/54a5a12a103807590b26ab076ea4032d657451fb/pyproject.toml#L32

which install torch
https://github.com/Lightning-AI/lightning/blob/8345689f1cd328ae956fa1638ed9c7d2eced49d5/requirements/pytorch/base.txt#L5

This disrupt the whole notebook flow.
/hold

@harshad16
Copy link
Member

Closing this PR,
continuing to work here: #263

@harshad16 harshad16 closed this Oct 16, 2023
atheo89 added a commit to atheo89/notebooks that referenced this pull request Apr 4, 2024
harshad16 pushed a commit to harshad16/odh-notebooks that referenced this pull request Apr 5, 2024
jiridanek pushed a commit to jiridanek/notebooks that referenced this pull request Jun 6, 2024
jiridanek pushed a commit to jiridanek/notebooks that referenced this pull request Jun 6, 2024
These changes shouldn't have any functional impact.

[fix] CI for the images checks based on recent updates

[fix] This fixes an inconsistency with the kustomize params

Inconsistency for codeserver notebook parameters. There was upstream
change recently that probably not got properly backported to downstream,
see [1,2].

* [1] opendatahub-io#524
* [2] red-hat-data-services@ceb3dc8

Set the rstudio builds with the branch rhoai-2.10

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Update image commits for release N via digest-updater-9215094498 GitHub action

Update images for release N via digest-updater-9215094498 GitHub action

Update file via  digest-updater-9213110410 GitHub action

Allow runtime script to cp the package from bin to Rpackage default path

Update codeflare-sdk version on imagestreams annotations (opendatahub-io#235)

* Update codeflare-sdk version on imagestreams annotations
* fix kfp version in the annotation for tensorflow

Co-authored-by: Jan Stourac <jstourac@redhat.com>

Update images for release N and N-1 with 2024a commit db8bd76

Update file via  digest-updater-8806399693 GitHub action

Update annotations for kfp (opendatahub-io#229)

Update image commits for release N via digest-updater-8665769109 GitHub action

Update images for release N via digest-updater-8665769109 GitHub action

Update manifest for code-freeze 2.9

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Update image commits for release N-1 via digest-updater-8581586298 GitHub action

Update images for release N-1 via digest-updater-8581586298 GitHub action

Update image commits for release N via digest-updater-8581586298 GitHub action

Update images for release N via digest-updater-8581586298 GitHub action

Update file via  digest-updater-8577545330 GitHub action

Fix the runtime updater github action branch 2024a

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Fix the runtime updater github action

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Remove the intel based image from the overlay as its ODH only
- Fix the typo in the datascience notebook

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Revert nginx version to 1.22 since 1.24 is not available on rhel yet

update cuda layer for RHEL to 12.1

Add runtimes workflow updater

Update digest updater workflow

Fix check-params-env test with the new changes (opendatahub-io#196)

Update Imagesteam for habana 1.13

Update runtime images with e1aee40 build commit

Update the manifests to retain old image in shadow state

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Update image commits for release N via digest-updater-8319475892 GitHub action

Update images for release N via digest-updater-8319475892 GitHub action

Update Codeserver ImageStream for the 2024a release inclusion (opendatahub-io#173)

* Update Codeserver imagestream with the 2024a release

Co-authored-by: Harshad Reddy Nalla <harshadreddy16@gmail.com>

Fix test file for the trustyai image

I don't really understand how and why this file was broken by this
commit aac0662 . Our CI check notifies
that something is broken in the file.

Update Imagestreams with in favor of the new release 2024.1 (opendatahub-io#175)

Co-authored-by: Harshad Reddy Nalla <hnalla@redhat.com>

Update digest updater workflow in favor 2024a release

Remove opendatahub.io/dashboard: 'true' label from rstudio ImageSteams

Create sync workflow for the release-2024a

Format yaml and json files to statisfy code-quality
- Fix validation of the params-env

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Update RStudio-server Dockefile for RHEL version

Fix library path version on rsession.conf file

hot fix: bump cuda resources

HotFix: Remove the annotation notebook-images=true from RStudio imagestreams

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Fix user R library path version

Update image commits for release N via digest-updater-7846262944 GitHub action

Update images for release N via digest-updater-7846262944 GitHub action

Remove the R-package install from workbench

Co-authored-by: Diamond Bryant <dibryant@redhat.com>
Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Fix naming for RStudio Server on rhel flavor

Increase build resources for R Studio buildconfigs

Mount the secret on the buildConfig instead of using ENVs to avoid their expose on the logs

Adjust the imagestream annotation for codeflare-sdk upgrade

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Update image commits for release N via digest-updater-7761501425 GitHub action

Update images for release N via digest-updater-7761501425 GitHub action

Add optional: true option for the base and server url envs

Add BuildConfiguration objects to build RStudio and CUDA RStudio images on OCP cluster

Fixes on the CUDA Dockerfile

setup r-studio based with rhel9 base image (opendatahub-io#125)

* Content of R Studio switched to the rhel based image.

Add rhel9 base image

[Fix] typo in logging of the `notebook-digest-updater.yaml`

Update image commits for release N via digest-updater-7533330854 GitHub action

Update images for release N-1 via digest-updater-7533330854 GitHub action

Update images for release N via digest-updater-7533330854 GitHub action

Fix: update the code-server and annotation

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>
Co-authored-by: aTheo <atheodorak@outlook.com>

Incorporate VSCode on Downstream (opendatahub-io#105)

Co-authored-by: Harshad Reddy Nalla <hnalla@redhat.com>

hot-fix: Fix the tensorflow imagestream by removing the trailing space

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

hot-fix: Fix the imagestream minimal-cuda sha

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

hot-fix: Fixed imagestream with CVE 44487 changes

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

hot-fix: update the base ubi9 images for cve 44487 fix

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

hot-fix: CVE 44487 fix with libnghttp2

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Update the pipfile.lock via the weekly workflow action

chores: Update the runtime image with the commit: 8bda2fa

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Patch the imagestream by removing habana 1.11.0

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Update the runtime image with the commit: 8bda2fa on main

Update images for release N via digest-updater-6655629712 GitHub action

Fix the annotation and additional recommended-true

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Patch the imagestream to have same name as in odh-manifests

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Fix digest updater from failing if there are no updates on the image streams

Fix the path to the params.env file

Several fixes

Upgrade the notebook images with 2023b and 2023a images

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>

Include only sync github workflow on the main branch

Signed-off-by: Harshad Reddy Nalla <hnalla@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Addition of CodeFlare SDK to datascience base image
4 participants