Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spring cleaning for dependencies #705

Closed
3 tasks
lbianchi-lbl opened this issue Feb 11, 2022 · 11 comments · Fixed by #1133
Closed
3 tasks

Spring cleaning for dependencies #705

lbianchi-lbl opened this issue Feb 11, 2022 · 11 comments · Fixed by #1133
Labels
discussion Discussion enhancement New feature or request Priority:High High Priority Issue or PR

Comments

@lbianchi-lbl
Copy link
Contributor

lbianchi-lbl commented Feb 11, 2022

Goals

  • Get rid of dependencies that are not needed anymore
  • Separate remaining dependencies in two broad categories:
    • core: used extensively throughout the IDAES codebase
    • optional: only used in a limited subset of submodules and/or applications, i.e. not all IDAES users will need them
  • Divide optional dependencies in a reasonable amount (2 <= n < 8) of categories to be used as extras_require, i.e. pip install idaes-pse[my-dependencies]

Current status

Requirement Imported in modules Notes Effort to make optional
backports.shutil_get_terminal_size Only needed for Python <3.3 0
bunch 0
click >9000
colorama idaes.dmf.util 1
flask UI, but probably not needed anymore? 0
flask-cors Same 0
jupyter Used for the examples 2
pywin32 Only needed for version constraint 2
lxml idaes.dmf.util 1
matplotlib ~30 3
nbconvert idaes.commands.examples 1
nbformat ideas.commands.examples, idaes.dmf 1
numpy ~90 5
networkx pyomo sequential modular tool, and by extension many flowsheet examples Easy to make optional, but used fairly extensively 1
omlt idaes.surrogate.keras_surrogate 1
pandas ~50 4
pint all IDAES models, idaes.dmf.model_data >9000
psutil 0
pyomo ~500 >9000
pytest ~300 test_* and conftest.py 2
pyyaml idaes.generic_models.properties.core.coolprop, idaes.config, idaes.dmf 3
requests idaes.ui.fsvis 1
python-slugify idaes.ui.fsvis 1
scipy ~20 3
seaborn idaes.dmf.model_data, idaes.core.base.costing_base(future) Already optional manual install 1
sympy idaes.core.util.expr_doc, idaes.surrogate.helmet Helmet should not be considered too heavily in this, it will soon end up in the "contrib" section 2
tinydb idaes.dmf 1
rbfopt idaes.surrogate.ripe This is contrib code, and this should definitely be an optional dependency 1
ipython<8.0.0 Only needed for version constraint 2
tensorflow idaes.surrogate.keras_surrogate Already has conditional import 1
coolprop idaes.generic_models.properties.core.coolprop + test Already has conditional import 1

(table generated with the useful tool https://www.tablesgenerator.com/markdown_tables)

Effort to make optional:

  • 0: can be removed
  • 1: extras_require + conditional import
  • 2: Like 1, with caveats
  • 3: Possible but non-negligible amount of work required; probably not worth it
  • 4: Possible, but a lot of work; most likely not worth it
  • 5: Not impossible, but realistically not feasible
  • >5: "Let's rewrite Pyomo from scratch"

Proposed grouping

Option A

Core dependencies

  • pyomo
  • pint
  • numpy
  • pandas
  • matplotlib
  • networkx
  • pyyaml
  • scipy
  • click

Optional dependencies in extras_require

  • testing
    • pytest
  • examples OR jupyter OR notebooks
    • jupyter
    • nbconvert
    • nbformat
    • ipython (version constraint only)
    • pywin32 (version constraint only)
  • dmf
    • colorama
    • lxml
    • tinydb
    • nbformat
  • ui OR fsvis
    • requests
    • python-slugify
  • omlt OR keras OR mlai etc
    • omlt
    • tensorflow

Optional dependencies (standalone)

  • sympy
  • coolprop - this is also useful for verification testing of property models.
  • rbfopt
@lbianchi-lbl lbianchi-lbl added enhancement New feature or request discussion Discussion labels Feb 11, 2022
@lbianchi-lbl lbianchi-lbl self-assigned this Feb 11, 2022
@andrewlee94
Copy link
Member

andrewlee94 commented Feb 11, 2022

@lbianchi-lbl Do you want us to update the table above if we have comments to make? E.g. pint is basically required for all IDAES models through our use of pyomo's unit conversion tools.

@lbianchi-lbl
Copy link
Contributor Author

@lbianchi-lbl Do you want us to update the table above if we have comments to make? E.g. pint is basically required for all IDAES models through our use of pyomo's unit conversion tools.

@andrewlee94 that's an excellent point and yes, please feel free to add corrections. I've started by looking at modules that are directly imported by IDAES code, but this ignores cases such as pint where the module is used extensively even though it's not imported.

Also, the table might be finicky to edit by hand, so you can write corrections in a comment and I can then amend the table if that seems easier.

@andrewlee94
Copy link
Member

On the topic of things like pytest - these are only used by the tests, an thus many users probably aren't that concerned. However, we do (or at least did) tell users to run tests as part of the installation.

Personally, I would be inclined to make things like this a developer-only dependency to keep the burden on the user low, but we would need to come up with an alternative for testing an IDAES install - running a prebuilt example notebook would probably be a good alternative, and honestly more useful to a general user.

@andrewlee94
Copy link
Member

@lbianchi-lbl I would also suggest that we might also want to have some further "use-case" only dependencies - i.e. things that are only used by one (sub-)package in the code. These we would just have a check for in the code, and tell the user that they need to get the dependency themselves.

I can see a few cases above where a "contrib-quality" package is bringing in a new dependency (e.g. rbfopt), and I don't think we should go out of our way to support these cases.

@lbianchi-lbl
Copy link
Contributor Author

@andrewlee94 - thanks for the corrections for pint, networkx, and pyyaml. It seems that these should be part of the core dependencies, also considering that (AFAIK) the payoff from making them optional doesn't seem very substantial.

For pytest, I get what you're saying in that its primary use is by IDAES contributors, which would make it part of the separate dev dependencies in requirements-dev.txt. However, being able to use the test suite to verify that IDAES has been set up correctly is useful not only for end users, but also for automated testing of any "site-packages" (non-editable) installation, including when cutting a release. We've been using this extensively with @ksbeattie, and makes us inclined towards considering pytest as an optional dependency (installable as e.g. idaes-pse[testing]), rather than an exclusively dev-only requirement.

@lbianchi-lbl
Copy link
Contributor Author

@lbianchi-lbl I would also suggest that we might also want to have some further "use-case" only dependencies - i.e. things that are only used by one (sub-)package in the code. These we would just have a check for in the code, and tell the user that they need to get the dependency themselves.

I can see a few cases above where a "contrib-quality" package is bringing in a new dependency (e.g. rbfopt), and I don't think we should go out of our way to support these cases.

@andrewlee94 I agree, that's another good point. As I see it, the main criteria for deciding whether a dependency mypkg would be "worthy" of an extras_require tag are:

  • Is mypkg only used/useful as part of a "bundle" with other dependencies?
  • Is mypkg required by substantial parts of the codebase such as not including it (and therefore, let the tests that require it be skipped) is going to affect the test coverage significantly?
  • Does mypkg have any particular version and/or platform-specific constraints, such that it would make sense to encode them in a requirement directive within setup.py?

For any of the dependencies that we're classifying as optional, we could use these criteria to decide the extras_require tag (if any) that it will be grouped into.

@lbianchi-lbl
Copy link
Contributor Author

I've edited the original comment adding a first proposed grouping for the dependencies, as "Option A". Feel free to add your thoughts, either in a comment to the issue or by copying "Option A", making the changes, and posting it under a different identifier.

@ksbeattie
Copy link
Member

Once this is sorted out, we need to update this page in our docs: https://idaes-pse.readthedocs.io/en/stable/tutorials/getting_started/opt_dependencies.html#optional-dependencies

@eslickj
Copy link
Member

eslickj commented Apr 30, 2022

I added seaborn to the table. I used it in one place and you currently need to manually install if you want it. Someone also commented that they may use it in the future. It's definitely very optional.

@ksbeattie
Copy link
Member

This might be a good candidate for making at least some of these changes immediately after the Aug release.

@lbianchi-lbl
Copy link
Contributor Author

lbianchi-lbl commented Mar 2, 2023

Summarizing the next steps for this:

  • As a low-hanging fruit, make a first pass and get rid of all dependencies that can be removed with no or minimal changes to the code (e.g. something is imported but not actually used)
  • The biggest payoff would probably be separating the Jupyter stack, which can be done once idaes get-examples and the DMF are removed and/or adapted

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Discussion enhancement New feature or request Priority:High High Priority Issue or PR
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

8 participants