added requirements.txt file with newer dependencies (Python 3.11) #6

Open
wants to merge 1 commit into
base: master

@moulai moulai commented Apr 10, 2024

Summary

Dear @tgasser,

I have added a requirements.txt to the OSCAR repository to facilitate the installation and management of the project's runtime environment.

The Python version specified is 3.11.8, and the versions of the dependencies are up to date as of April 10, 2024. I have tested all files in run_scripts and checked all outputs to ensure that OSCAR runs correctly with this Python version and these dependency versions.

I also ran the same tests with the environment specified in README.md (Python 3.9.19 and the dependency versions listed there). With Python 3.11.8 and the newer dependencies, OSCAR runs roughly twice as fast: the total execution time of the main process is about halved.

This improvement is likely due, at least in part, to optimizations in Python 3.11 that significantly improve execution efficiency ("Python 3.11 is between 10-60% faster than Python 3.10. On average, we measured a 1.25x speedup on the standard benchmark suite." See: What’s New In Python 3.11). Newer versions of dependencies such as xarray, numpy, scipy, and pandas also add functionality (potentially including performance optimizations) and fix many bugs. In summary, updating Python and the dependencies speeds up execution while OSCAR continues to run correctly in my tests.

If you feel more testing is needed to ensure the new version's environment configuration is reliable, I am happy to undertake such work. I am also willing to update the relevant sections about running and environment configuration in the README.md.

Please feel free to comment and @ me with any questions or needs!

Test Comparison

I ran the tests on my PC, keeping conditions essentially the same apart from the Python environment.

Running Environment

  • CPU: Intel Core i9-11900 @ 2.50GHz
  • Memory: 32GB (DDR4 2667MHz)
  • Disk: SSD (Samsung)
  • GPU: The current version of OSCAR makes almost no use of the GPU for computation, so the GPU model is not relevant here.

Results

Note: The total running time is the sum of the model execution times printed by OSCAR during one script run.

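Columns marked 3.9.19 use Python 3.9.19 with the dependency versions from the README; columns marked 3.11.8 use Python 3.11.8 with the newer dependency versions. The last column is the 3.9.19 running time divided by the 3.11.8 running time.

| Test file | Total running time, 3.9.19 (min) | CPU usage, 3.9.19 | Memory usage, 3.9.19 (MB) | Total running time, 3.11.8 (min) | CPU usage, 3.11.8 | Memory usage, 3.11.8 (MB) | Running time ratio |
|---|---|---|---|---|---|---|---|
| basic_example_1.py | 21.6 | 15% | 180 | 10.8 | 20% | 210 | 2.00x |
| basic_example_2.py | 35.4 | 10% | 180 | 16.3 | 18% | 490 | 2.17x |
| basic_example_3.py | 4.0 | 10% | 200 | 1.7 | 15% | 500 | 2.35x |
| check_LUC_structure.py | 25.9 | 10% | 14000 | 13.6 | 22% | 13500 | 1.90x |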

With Python 3.9, each script takes roughly twice as long as with Python 3.11 (between 1.90x and 2.35x). CPU usage is also lower under Python 3.9 than under Python 3.11, which may indicate that the older environment makes less efficient use of the available hardware.
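
The ratio column can be recomputed directly from the two running-time columns. A minimal sketch in Python, using the totals from the table; the names `TOTALS_MIN` and `running_time_ratio` are illustrative only and are not part of OSCAR:

```python
# Total running times in minutes, copied from the results table:
# (Python 3.9.19 + README dependencies, Python 3.11.8 + newer dependencies).
TOTALS_MIN = {
    "basic_example_1.py": (21.6, 10.8),
    "basic_example_2.py": (35.4, 16.3),
    "basic_example_3.py": (4.0, 1.7),
    "check_LUC_structure.py": (25.9, 13.6),
}


def running_time_ratio(old_minutes: float, new_minutes: float) -> float:
    """Return how many times longer the old environment took, to two decimals."""
    return round(old_minutes / new_minutes, 2)


RATIOS = {
    name: running_time_ratio(old, new) for name, (old, new) in TOTALS_MIN.items()
}

if __name__ == "__main__":
    for name, ratio in RATIOS.items():
        print(f"{name}: {ratio:.2f}x")
```

Running the script reproduces the values reported in the ratio column for all four test files.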

Detailed Outputs

basic_example_1.py

Python 3.9.19 + Dependency Versions from README:

OSCAR_v3 running
year = 2014 (nt = 3)
total running time: 10.4 minutes
OSCAR_v3 running
year = 2100 (nt = 10)
total running time: 11.2 minutes

Python 3.11.8 + Newer Dependency Versions:

OSCAR_v3 running
year = 2014 (nt = 3)
total running time: 4.9 minutes
OSCAR_v3 running
year = 2100 (nt = 10)
total running time: 5.9 minutes

basic_example_2.py

Python 3.9.19 + Dependency Versions from README:

OSCAR_v3 running
year = 2011 (nt = 3)
total running time: 9.1 minutes
OSCAR_v3 running
year = 2011 (nt = 2)
total running time: 9.1 minutes
OSCAR_v3 running
year = 2011 (nt = 2)
total running time: 8.7 minutes
OSCAR_v3 running
year = 2011 (nt = 2)
total running time: 8.4 minutes

Python 3.11.8 + Newer Dependency Versions:

OSCAR_v3 running
year = 2011 (nt = 3)
total running time: 3.8 minutes
OSCAR_v3 running
year = 2011 (nt = 2)
total running time: 3.9 minutes
OSCAR_v3 running
year = 2011 (nt = 2)
total running time: 4.3 minutes
OSCAR_v3 running
year = 2011 (nt = 2)
total running time: 4.3 minutes

basic_example_3.py

Python 3.9.19 + Dependency Versions from README:

OSCAR_v3_landC running
year = 2018 (nt = 3)
total running time: 4.0 minutes

Python 3.11.8 + Newer Dependency Versions:

OSCAR_v3_landC running
year = 2018 (nt = 3)
total running time: 1.7 minutes

check_LUC_structure.py

Python 3.9.19 + Dependency Versions from README:

OSCAR_v3_landC running
year = 2018 (nt = 3)
total running time: 5.0 minutes
OSCAR_v3_landC_split_LUC running
year = 2018 (nt = 3)
total running time: 9.0 minutes
OSCAR_v3_landC_full_LUC running
year = 2018 (nt = 3)
total running time: 5.8 minutes
OSCAR_v3_landC_lite_LUC running
year = 2018 (nt = 3)
total running time: 4.6 minutes
OSCAR_v3_landC_cut_LUC running
year = 2018 (nt = 3)
total running time: 1.5 minutes

Python 3.11.8 + Newer Dependency Versions:

OSCAR_v3_landC running
year = 2018 (nt = 3)
total running time: 2.6 minutes
OSCAR_v3_landC_split_LUC running
year = 2018 (nt = 3)
total running time: 5.3 minutes
OSCAR_v3_landC_full_LUC running
year = 2018 (nt = 3)
total running time: 2.9 minutes
OSCAR_v3_landC_lite_LUC running
year = 2018 (nt = 3)
total running time: 2.0 minutes
OSCAR_v3_landC_cut_LUC running
year = 2018 (nt = 3)
total running time: 0.8 minutes
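
Each total in the results table is the sum of the "total running time" lines printed during one script run. A small sketch of that bookkeeping, assuming the console output was saved as text; `sum_running_time` is a hypothetical helper, not an OSCAR function:

```python
import re

# Matches lines such as "total running time: 10.4 minutes".
RUNTIME_LINE = re.compile(r"total running time:\s*([0-9]+(?:\.[0-9]+)?)\s*minutes")


def sum_running_time(log_text: str) -> float:
    """Sum every reported running time (in minutes) found in the log text."""
    minutes = [float(value) for value in RUNTIME_LINE.findall(log_text)]
    return round(sum(minutes, 0.0), 1)


# Output of basic_example_1.py under Python 3.9.19, as listed above.
SAMPLE_LOG = """\
OSCAR_v3 running
year = 2014 (nt = 3)
total running time: 10.4 minutes
OSCAR_v3 running
year = 2100 (nt = 10)
total running time: 11.2 minutes
"""

if __name__ == "__main__":
    print(sum_running_time(SAMPLE_LOG))
```

For the sample log this gives 21.6 minutes, matching the basic_example_1.py entry for Python 3.9.19 in the results table.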

Appendix

requirements.txt

Python 3.9.19

xarray == 0.20.1
netCDF4 == 1.5.7
numpy == 1.23.3
scipy == 1.9.3
matplotlib == 3.8.4
networkx == 3.2.1
pandas == 1.3.5

Please note that xarray == 0.20.1 can only be used with pandas == 1.3.5; using a higher version of pandas results in an error.

Python 3.11.8

xarray == 2024.3.0
netCDF4 == 1.6.5
numpy == 1.26.4
scipy == 1.13.0
matplotlib == 3.8.4
networkx == 3.3
pandas == 2.2.1
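
Before running OSCAR, it can be useful to confirm that the active environment actually matches these pins. The sketch below uses only the standard library (`importlib.metadata`); the `PINS` dictionary mirrors the Python 3.11.8 list above, and `installed_version` and `find_mismatches` are illustrative helpers rather than something shipped with OSCAR:

```python
from importlib import metadata
from typing import Callable, Dict

# Pinned versions copied from the Python 3.11.8 list above.
PINS: Dict[str, str] = {
    "xarray": "2024.3.0",
    "netCDF4": "1.6.5",
    "numpy": "1.26.4",
    "scipy": "1.13.0",
    "matplotlib": "3.8.4",
    "networkx": "3.3",
    "pandas": "2.2.1",
}


def installed_version(name: str) -> str:
    """Return the installed version of a distribution, or "" if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return ""


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], str] = installed_version,
) -> Dict[str, str]:
    """Map each package whose installed version differs from its pin to what was found."""
    found = {name: lookup(name) for name in pins}
    return {
        name: version or "not installed"
        for name, version in found.items()
        if version != pins[name]
    }


if __name__ == "__main__":
    for name, found in find_mismatches(PINS).items():
        print(f"{name}: expected {PINS[name]}, found {found}")
```

An empty result means every pinned package is installed at the listed version. The `lookup` parameter exists only so the comparison can be exercised without touching the real environment.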

environment.yml

I created two brand-new Python environments using conda to test their differences. Below are the exported environment configurations for Python 3.9.19 and Python 3.11.8, respectively.

Python 3.9.19

name: oscar-39
channels:
  - defaults
dependencies:
  - ca-certificates=2024.3.11=haa95532_0
  - openssl=3.0.13=h2bbff1b_0
  - pip=23.3.1=py39haa95532_0
  - python=3.9.19=h1aa4202_0
  - setuptools=68.2.2=py39haa95532_0
  - sqlite=3.41.2=h2bbff1b_0
  - vc=14.2=h21ff451_1
  - vs2015_runtime=14.27.29016=h5e58377_2
  - wheel=0.41.2=py39haa95532_0
  - pip:
      - cftime==1.6.3
      - contourpy==1.2.1
      - cycler==0.12.1
      - fonttools==4.51.0
      - importlib-resources==6.4.0
      - kiwisolver==1.4.5
      - matplotlib==3.8.4
      - netcdf4==1.5.7
      - networkx==3.2.1
      - numpy==1.23.3
      - packaging==24.0
      - pandas==1.3.5
      - pillow==10.3.0
      - pyparsing==3.1.2
      - python-dateutil==2.9.0.post0
      - pytz==2024.1
      - scipy==1.9.3
      - six==1.16.0
      - tzdata==2024.1
      - xarray==0.20.1
      - zipp==3.18.1

Python 3.11.8

name: oscar-311
channels:
  - defaults
dependencies:
  - bzip2=1.0.8=h2bbff1b_5
  - ca-certificates=2024.3.11=haa95532_0
  - libffi=3.4.4=hd77b12b_0
  - openssl=3.0.13=h2bbff1b_0
  - pip=23.3.1=py311haa95532_0
  - python=3.11.8=he1021f5_0
  - setuptools=68.2.2=py311haa95532_0
  - sqlite=3.41.2=h2bbff1b_0
  - tk=8.6.12=h2bbff1b_0
  - vc=14.2=h21ff451_1
  - vs2015_runtime=14.27.29016=h5e58377_2
  - wheel=0.41.2=py311haa95532_0
  - xz=5.4.6=h8cc25b3_0
  - zlib=1.2.13=h8cc25b3_0
  - pip:
      - certifi==2024.2.2
      - cftime==1.6.3
      - contourpy==1.2.1
      - cycler==0.12.1
      - fonttools==4.51.0
      - kiwisolver==1.4.5
      - matplotlib==3.8.4
      - netcdf4==1.6.5
      - networkx==3.3
      - numpy==1.26.4
      - packaging==24.0
      - pandas==2.2.1
      - pillow==10.3.0
      - pyparsing==3.1.2
      - python-dateutil==2.9.0.post0
      - pytz==2024.1
      - scipy==1.13.0
      - six==1.16.0
      - tzdata==2024.1
      - xarray==2024.3.0

Other Notes

Running the test files in run_scripts with the new version of xarray produces a DeprecationWarning, which does not affect program execution:

DeprecationWarning: dropping variables using `drop` is deprecated; use drop_vars.
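
The warning should go away once calls that remove variables with `drop` are switched to `drop_vars`, as the message suggests. A minimal illustration on a toy dataset; the variable names `a` and `b` are made up, and this does not reproduce the actual call sites in OSCAR:

```python
import numpy as np
import xarray as xr

# Toy dataset with two data variables along one dimension.
ds = xr.Dataset(
    {
        "a": ("x", np.arange(3)),
        "b": ("x", np.arange(3) * 2.0),
    }
)

# Deprecated form that triggers the warning quoted above:
#     trimmed = ds.drop("b")
# Replacement suggested by the warning message:
trimmed = ds.drop_vars("b")

print(list(trimmed.data_vars))
```

`drop_vars` returns a new dataset and leaves `ds` unchanged, so it can replace `drop` without altering the surrounding logic, assuming the existing calls only drop variables and not index labels.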

I plan to further optimize OSCAR's running efficiency by using the numba library, optimizing algorithms, and implementing parallel computing, all with minimal changes to the existing code. I would greatly value your opinion on this initiative. Would this effort be of interest to you, or do you think it is unnecessary?
