Commit 8b386e9

Update README.md modin vs pandas
1 parent 90a9fe4 commit 8b386e9

1 file changed (+120, -56 lines): AI-and-Analytics/Getting-Started-Samples/Modin_Vs_Pandas

@@ -1,11 +1,11 @@
1-
# Modin* Vs. Pandas Performance Sample
1+
# Modin Vs. Pandas Performance Sample
22

3-
The `Modin* Vs. Pandas Performance` code illustrates how to use Modin* to replace the Pandas API. The sample compares the performance of Modin* and the performance of Pandas for specific dataframe operations.
3+
The `Modin Vs. Pandas Performance` code illustrates how to use Modin* to replace the Pandas API. The sample compares the performance of Modin and the performance of Pandas for specific dataframe operations.
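As a rough illustration of the drop-in idea described above, the sketch below times the same dataframe operation under Pandas and, when it is installed, under Modin. The helper names `time_call`, `load_backend`, and `compare_backends`, the data shape, and the choice of `mean` as the operation are assumptions made for illustration; they are not taken from the sample notebook. The only library behavior relied on is that `modin.pandas` mirrors the `pandas` namespace.

```python
import importlib
import time


def time_call(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def load_backend(name):
    """Import a dataframe module by name, or return None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def compare_backends(rows=100_000, cols=4):
    """Time DataFrame.mean() under each available backend; returns {name: seconds}."""
    timings = {}
    for name in ("pandas", "modin.pandas"):
        pd = load_backend(name)
        if pd is None:
            continue  # backend not installed in this environment
        # Hypothetical test data: `cols` integer columns of length `rows`.
        frame = pd.DataFrame({f"c{i}": range(rows) for i in range(cols)})
        _, timings[name] = time_call(frame.mean)
    return timings
```

Calling `compare_backends()` returns a dictionary of elapsed seconds keyed by backend name. Depending on the execution engine, Modin may return before all work has completed, so treat such numbers as indicative rather than as a benchmark.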
44

55
| Area | Description
66
|:--- |:---
77
| Category | Concepts and Functionality
8-
| What you will learn | How to accelerate the Pandas API using Modin*.
8+
| What you will learn | How to accelerate the Pandas API using Modin.
99
| Time to complete | Less than 10 minutes
1010

1111
## Purpose
@@ -19,77 +19,138 @@ You can run the sample locally or in Google Colaboratory (Colab).
1919
|:--- |:---
2020
| OS | Ubuntu* 20.04 (or newer)
2121
| Hardware | Intel® Core™ Gen10 Processor <br> Intel® Xeon® Scalable Performance processors
22-
| Software | Modin*
22+
| Software | Intel® Distribution of Modin*
23+
24+
> **Note**: AI and Analytics samples are validated on the AI Tools Offline Installer. For the full list of validated platforms, refer to [Platform Validation](https://github.com/oneapi-src/oneAPI-samples/tree/master?tab=readme-ov-file#platform-validation).
25+
<!-- for migrated samples - modify the note above to provide information on samples validation and preferred installation option -->
2326
2427
## Key Implementation Details
2528

26-
This code sample is implemented for CPU using Python programming language. The sample requires NumPy, Pandas, Modin* libraries, and the time module in Python.
29+
This code sample is implemented for CPU using the Python programming language. The sample requires the NumPy, Pandas, and Modin libraries and the time module in Python.
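One way to confirm that the requirements listed above are present before opening the notebook is a small availability check. This is a minimal sketch using only the standard library; the function name `missing_modules` and the tuple `REQUIRED_MODULES` are hypothetical names introduced here, not part of the sample.

```python
import importlib.util

# Top-level module names for the requirements listed above; `time` ships with Python.
REQUIRED_MODULES = ("numpy", "pandas", "modin", "time")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

The check only locates each module without importing it, so it does not start a Modin execution engine.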
2730

2831
## Environment Setup
2932

30-
If you want to run the sample on a local system using a command-line interface (CLI), you must install the Modin in a new Conda* environment first.
33+
You will need to download and install the following toolkits, tools, and components to use the sample.
34+
<!-- Use numbered steps instead of subheadings -->
3135

32-
### Install Modin*
36+
**1. Get AI Tools**
3337

34-
1. Create a Conda environment.
35-
```
36-
conda create --name modin
37-
```
38-
2. Activate the Conda environment.
39-
```
40-
source activate modin
41-
```
42-
3. Remove existing versions of Modin* (if any exist).
43-
```
44-
conda remove modin --y
45-
```
46-
4. Install Modin (v0.12.1 or newer).
47-
```
48-
pip install modin[all]==0.12.1
49-
```
50-
5. Install the NumPy and Pandas libraries.
51-
```
52-
pip install numpy
53-
pip install pandas
54-
```
55-
6. Install ipython to run the notebook on your system.
56-
```
57-
pip install ipython
58-
```
59-
### Run the Sample
38+
Required AI Tools: Modin
6039

61-
1. Change to the directory containing the `Modin_Vs_Pandas.ipynb` notebook file on your local system.
40+
If you have not already done so, select and install these tools via the [AI Tools Selector](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-tools-selector.html). AI and Analytics samples are validated on the AI Tools Offline Installer, so selecting the Offline Installer option in AI Tools Selector is recommended.
6241

63-
2. Run the sample notebook.
64-
```
65-
ipython Modin_Vs_Pandas.ipynb
66-
```
42+
>**Note**: If the Docker option is chosen in AI Tools Selector, refer to [Working with Preset Containers](https://github.com/intel/ai-containers/tree/main/preset) to learn how to run the Docker containers and samples.
6743
68-
## Run the `Modin* Vs Pandas Performance` Sample in Google Colaboratory
44+
**2. (Offline Installer) Activate the AI Tools bundle base environment**
45+
<!-- this step is from AI Tools GSG, please don't modify unless GSG is updated -->
46+
If the default path is used during the installation of AI Tools:
47+
```
48+
source $HOME/intel/oneapi/intelpython/bin/activate
49+
```
50+
If a non-default path is used:
51+
```
52+
source <custom_path>/bin/activate
53+
```
6954

70-
1. Change to the directory containing the `Modin_Vs_Pandas.ipynb` notebook file on your local system.
55+
56+
**3. (Offline Installer) Activate relevant Conda environment**
57+
<!-- specify relevant conda environment name in Offline Installer for this sample -->
58+
```
59+
conda activate modin
60+
```
61+
62+
**4. Clone the GitHub repository**
63+
<!-- for oneapi-samples: git clone https://github.com/oneapi-src/oneAPI-samples.git
64+
cd oneAPI-samples/AI-and-Analytics/<samples-folder>/<individual-sample-folder> -->
65+
<!-- for migrated samples - provide git clone command for individual repo and cd to sample dir -->
66+
```
67+
git clone https://github.com/oneapi-src/oneAPI-samples.git
68+
cd oneAPI-samples/AI-and-Analytics/Getting-Started-Samples/Modin_Vs_Pandas
69+
```
7170

72-
2. Open the notebook file, and remove the prepended number sign (#) symbol from the following lines:
73-
```
74-
#!pip install modin[all]==0.12.1
75-
#!pip install numpy
76-
#!pip install pandas
77-
```
78-
These changes will install the Modin and the NumPy and Pandas libraries when run in the Colab notebook.
71+
**5. Install dependencies**
72+
<!-- It is required to have requirement.txt file in sample dir. It should list additional libraries, such as matplotlib, ipykernel etc. -->
73+
>**Note**: Before running the following commands, make sure your Conda/Python environment with AI Tools installed is activated.
7974
80-
3. Save your changes.
75+
```
76+
pip install -r requirements.txt
77+
pip install notebook
78+
```
79+
For Jupyter Notebook, refer to [Installing Jupyter](https://jupyter.org/install) for detailed installation instructions.
8180

82-
4. Open [Google Colaboratory](https://colab.research.google.com/?utm_source=scs-index).
81+
## Run the Sample
82+
>**Note**: Before running the sample, make sure [Environment Setup](https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics/Getting-Started-Samples/Modin_Vs_Pandas#environment-setup) is completed.
8383
84-
5. Sign in to Colab using your Google account.
84+
Go to the section that corresponds to the installation method chosen in [AI Tools Selector](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-tools-selector.html) for the relevant instructions:
85+
* [AI Tools Offline Installer (Validated)](#ai-tools-offline-installer-validated)
86+
* [Conda/PIP](#condapip)
87+
* [Docker](#docker)
88+
<!-- for migrated samples - it's acceptable to change the order of the sections based on the validated/preferred installation options. However, all 3 sections (Offline, Conda/PIP, Docker) should be present in the doc -->
89+
### AI Tools Offline Installer (Validated)
8590

86-
6. Select **File** > **Upload notebook**.
91+
**1. Register Conda kernel to Jupyter Notebook kernel**
8792

88-
7. Upload the modified notebook file.
93+
If the default path is used during the installation of AI Tools:
94+
```
95+
$HOME/intel/oneapi/intelpython/envs/modin/bin/python -m ipykernel install --user --name=modin
96+
```
97+
If a non-default path is used:
98+
```
99+
<custom_path>/bin/python -m ipykernel install --user --name=modin
100+
```
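The two registration commands above differ only in the environment root. As a hedged sketch, the same command can be assembled from Python for whichever path applies; `kernel_install_command` is a hypothetical helper, and the default path below simply mirrors the one shown above.

```python
import shlex
from pathlib import Path


def kernel_install_command(env_root, kernel_name):
    """Build the ipykernel registration command for the environment at env_root."""
    python = Path(env_root) / "bin" / "python"
    return [str(python), "-m", "ipykernel", "install", "--user", f"--name={kernel_name}"]


# Default AI Tools location from the step above; replace it for a custom install path.
default_env = Path.home() / "intel" / "oneapi" / "intelpython" / "envs" / "modin"
command = kernel_install_command(default_env, "modin")
print(shlex.join(command))
```

The sketch only prints the command. To execute it, the list could be passed to `subprocess.run(command, check=True)` once the environment path has been confirmed.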
89101

90-
8. Change to the notebook, and click **Open**.
102+
**2. Launch Jupyter Notebook**
103+
<!-- add other flags to jupyter notebook command if needed, such as port 8888 or allow-root -->
104+
```
105+
jupyter notebook --ip=0.0.0.0
106+
```
107+
**3. Follow the instructions to open the URL with the token in your browser**
91108

92-
9. Select **Runtime** > **Run all**.
109+
**4. Select the Notebook**
110+
<!-- add sample file name -->
111+
```
112+
Modin_Vs_Pandas.ipynb
113+
```
114+
**5. Change the kernel to `modin`**
115+
<!-- specify relevant kernel name(s), for example `pytorch` -->
116+
**6. Run every cell in the Notebook in sequence**
117+
118+
119+
### Conda/PIP
120+
> **Note**: Before running the instructions below, make sure your Conda/Python environment with AI Tools installed is activated.
121+
122+
**1. Register Conda/Python kernel to Jupyter Notebook kernel**
123+
<!-- keep placeholders in this step, user could use any name for Conda/PIP env -->
124+
For Conda:
125+
```
126+
<CONDA_PATH_TO_ENV>/bin/python -m ipykernel install --user --name=<your-env-name>
127+
```
128+
To find <CONDA_PATH_TO_ENV>, run `conda env list` and locate the path of your Conda environment.
129+
130+
For PIP:
131+
```
132+
python -m ipykernel install --user --name=<your-env-name>
133+
```
134+
**2. Launch Jupyter Notebook**
135+
<!-- add other flags to jupyter notebook command if needed, such as port 8888 or allow-root -->
136+
```
137+
jupyter notebook --ip=0.0.0.0
138+
```
139+
**3. Follow the instructions to open the URL with the token in your browser**
140+
141+
**4. Select the Notebook**
142+
```
143+
Modin_Vs_Pandas.ipynb
144+
```
145+
**5. Change the kernel to `<your-env-name>`**
146+
<!-- leave <your-env-name> as a placeholder as user could choose any name for the env -->
147+
148+
**6. Run every cell in the Notebook in sequence**
149+
150+
### Docker
151+
AI Tools Docker images already have the Get Started samples pre-installed. Refer to [Working with Preset Containers](https://github.com/intel/ai-containers/tree/main/preset) to learn how to run the Docker containers and samples.
152+
153+
<!-- Remove Intel® DevCloud section or other outdated sections -->
93154

94155
## Example Output
95156

@@ -109,8 +170,11 @@ Example expected cell output is included in `Modin_Vs_Pandas.ipynb`.
109170
## License
110171

111172
Code samples are licensed under the MIT license. See
112-
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.
173+
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt)
174+
for details.
113175

114-
Third party program licenses are at [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).
176+
Third-party program licenses can be found here:
177+
[third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt)
115178

116179
*Other names and brands may be claimed as the property of others. [Trademarks](https://www.intel.com/content/www/us/en/legal/trademarks.html)
180+
