pre-install minimal requirement python packages on vscode #374

Merged
merged 1 commit into opendatahub-io:main on Dec 11, 2023

Conversation

atheo89
Member

@atheo89 atheo89 commented Nov 28, 2023

Related to: #345

Description

This PR introduces a minimal list of Python packages required for data science and programming workloads.

NOTE: A separate issue/PR will follow up to incorporate the standard database clients: #372

How Has This Been Tested?

  1. Spin up the vscode image generated from this PR
  2. Open a terminal inside the container, run pip list, and confirm that the installed packages appear in the output
  3. Test some of the installed packages: create a .py or .ipynb file and add the following code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Create a sample DataFrame
data = {'A': np.random.rand(100),
        'B': np.random.randn(100),
        'C': np.linspace(0, 1, 100)}

df = pd.DataFrame(data)

# Perform some operations using NumPy
df['D'] = np.sin(df['A']) + np.cos(df['B'])

# Plotting with Matplotlib
plt.figure(figsize=(10, 6))

plt.subplot(2, 2, 1)
plt.scatter(df['A'], df['B'])
plt.title('Scatter Plot')

plt.subplot(2, 2, 2)
plt.plot(df['C'], df['D'], color='red')
plt.title('Line Plot')

plt.subplot(2, 2, 3)
plt.hist(df['B'], bins=20, color='green', alpha=0.7)
plt.title('Histogram')

plt.subplot(2, 2, 4)
plt.boxplot(df[['A', 'C']])
plt.title('Boxplot')

plt.tight_layout()
plt.show()
  4. You should see the following plot:
    image

  5. If something fails, check which Python interpreter you are using
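
The pip list check and the interpreter check in the steps above could also be scripted. The following is a minimal sketch, not part of the PR: the helper name check_packages is hypothetical, and the package list covers only the three libraries imported in the test snippet, not the full list the image installs.

```python
import sys
from importlib import metadata

# Assumption: only the libraries imported in the test snippet above.
# The image's full package list is not reproduced here.
EXPECTED_PACKAGES = ["pandas", "numpy", "matplotlib"]


def check_packages(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Shows which interpreter is running, which helps when packages seem to be missing.
    print("Interpreter:", sys.executable)
    for name, version in check_packages(EXPECTED_PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If a package is reported as missing even though pip list shows it, the printed interpreter path is probably not the one the image installed the packages into.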

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@atheo89 atheo89 linked an issue Nov 28, 2023 that may be closed by this pull request
1 task
Contributor

@rkpattnaik780 rkpattnaik780 left a comment

LGTM! Able to run code snippets that use the added libraries.

/approve

@shalberd
Contributor

shalberd commented Dec 1, 2023

I was just talking about this, Python packages in VSCode, to a colleague of mine this week. So nice to see this! Looking good at first glance. Also, we have a topic of where to put VSCode extensions and make them available without download ... but that is a different story. Best regards from snowy CH.

@atheo89
Member Author

atheo89 commented Dec 1, 2023

Glad to hear that! 🙂

> Also, we have a topic of where to put VSCode extensions and make them available without download ... but that is a different story.

We are cooking something as well for this! Check out here: #347

> Best regards from snowy CH.

Greetings from Turin on the other side of the Alps!

Member

@harshad16 harshad16 left a comment

Thanks for the excellent work.
After checking closely, it feels like we are installing more packages than intended.


# PyTorch packages
tensorboard = "~=2.15.1"
torch = {version = "~=2.1.1", index = "pytorch"}
Member

Any reason for installing only torch and not tensorflow?
In hindsight, I feel we should not install any of these big packages
and should stick with basic packages like

boto3, matplotlib, pandas, numpy, scipy

Contributor

@shalberd shalberd Dec 6, 2023

Yes, there is a similar issue in @guimou's recent changes to the contrib datascience UI Python packages: they include all sorts of heavy nvidia* libraries, regardless of whether the base docker image is for Tensorflow or not. See opendatahub-io-contrib/workbench-images#48. Agreed that, as a first step here, minimal might be good. You could also get the changes from this PR into the snippets for codeserver VSCode at contrib, so users can always add more packages of their own, if they want to, in their own builds. Or maybe I could suggest that as a PR there ... how much are you in contact with Guillaume? Having this in the image build framework he introduced would enable end users to build their custom offline / airgapped images with as many Python libraries as they want. That applies to the VSCode extensions topic, too. The groundwork @atheo89 has done is great in my opinion.

Member Author

Thank you for reviewing this! 😊 I was in a dilemma about whether to include the most advanced packages or not. I agree with you; I'll omit them for now and stick to the basic data science packages.
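
One rough way to judge which packages count as "big" in this thread is to sum the on-disk size of the files each installed distribution records. The following is only a sketch and not something the PR itself does: the helper installed_size_bytes is hypothetical, and the names in the loop are simply the packages mentioned in the review comments above.

```python
import os
from importlib import metadata


def installed_size_bytes(dist_name):
    """Sum the size of the files recorded for a distribution; None if it is not installed."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    total = 0
    # dist.files can be None when the distribution has no file record.
    for entry in dist.files or []:
        path = dist.locate_file(entry)
        if os.path.isfile(path):
            total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    # Package names taken from the review discussion; adjust as needed.
    for name in ["boto3", "matplotlib", "pandas", "numpy", "scipy", "tensorboard", "torch"]:
        size = installed_size_bytes(name)
        label = "not installed" if size is None else f"{size / 1e6:.1f} MB"
        print(f"{name}: {label}")
```

The figure covers only the files each distribution lists for itself, so dependencies pulled in alongside a package are not counted.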

@guimou
Member

guimou commented Dec 6, 2023 via email

@shalberd
Contributor

shalberd commented Dec 6, 2023

@guimou codeflare, I see. To keep that aspect separate for the other contrib images context, I answered in odh-contrib

Member

@harshad16 harshad16 left a comment

Thanks for the work, lgtm.

/lgtm
/approve

Thanks all

@openshift-ci openshift-ci bot added the lgtm label Dec 11, 2023
Contributor

openshift-ci bot commented Dec 11, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: harshad16, rkpattnaik780

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@harshad16 harshad16 merged commit 0df7031 into opendatahub-io:main Dec 11, 2023
3 of 5 checks passed
@atheo89 atheo89 deleted the nbk-345 branch October 23, 2024 08:05
Development

Successfully merging this pull request may close these issues.

Pre-installed python packages in vscode environment
5 participants