Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V0.3 update #121

Merged
merged 231 commits into from
Sep 10, 2024
Merged

V0.3 update #121

merged 231 commits into from
Sep 10, 2024

Conversation

mosesyhc
Copy link
Member

@mosesyhc mosesyhc commented Jul 29, 2024

List for inclusion in surmise v0.3 update:

Maintenance + tests:

Functionality:

wildsm and others added 30 commits September 26, 2023 19:29
Bringing develop back up to date
include optional dependencies in pyproject.toml dynamic field
updating ptmc-removal branch of setup configurations
Removal of PTMC (depends on ptemcee for which maintenance has been dropped)
Was able to run all tox tasks successfully.  Did quick review of coverage
reports and sphinx output.  All looked acceptable.  Flake8 shows all passing,
which is as to be expected since it is already integrated in the repo.  Sphinx
output printed a few warnings.
Built the HTML form of the docs and all looks good.  The language should be
cleaned up during PR review.
The 3.8/macOS python-package job failed due to a scipy package with a
wrong/suspicious hash.  It was trying to install 1.4, which is less than the
minimum value of 1.7 specified for the package.  Trying 1.7 to see if it passes
again.  Sphinx package builds failed with a strange error.  I find in stack
overflow that this error message can be related with shallow git clones.
Asking action to download the full repo as suggested to see if that fixes it.
Trying to see if updating to latest version of GH action tools works.
@jared321
Copy link
Contributor

jared321 commented Sep 4, 2024

From building the wheel, I see a future warning

Compiling surmise/emulationsupport/matern_covmat.pyx because it depends on /private/var/folders/m5/glqmglsx1nq7c8_cv23p7zv80000gp/T/build-env-372ga1cq/lib/python3.12/site-packages/Cython/Includes/libc/string.pxd.
[1/1] Cythonizing surmise/emulationsupport/matern_covmat.pyx
/private/var/folders/m5/glqmglsx1nq7c8_cv23p7zv80000gp/T/build-env-372ga1cq/lib/python3.12/site-packages/Cython/Compiler/Main.py:381: FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /Users/jared/Projects/BAND/surmise/surmise/emulationsupport/matern_covmat.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)

When running the test suite I get the following warning

tests/test_emu_cal/test_emu_pcgpwimpute.py::test_imputemethod[RandomForest-expectation2]
  /Users/jared/local/venv/surmise_test/lib/python3.12/site-packages/sklearn/impute/_iterative.py:825: ConvergenceWarning: [IterativeImputer] Early stopping criterion not reached.

Is this acceptable?

Citations 2 & 9 are already published. Don't need "To Appear."

Citations 5 & 7 are missing their links.

Citation 6 is already published
https://journals.aps.org/prc/abstract/10.1103/PhysRevC.108.054905

Collaborators have name/affilitions, but contributors don't. Is inconsistency intended?

@mosesyhc
Copy link
Member Author

mosesyhc commented Sep 4, 2024

From building the wheel, I see a future warning

Compiling surmise/emulationsupport/matern_covmat.pyx because it depends on /private/var/folders/m5/glqmglsx1nq7c8_cv23p7zv80000gp/T/build-env-372ga1cq/lib/python3.12/site-packages/Cython/Includes/libc/string.pxd.
[1/1] Cythonizing surmise/emulationsupport/matern_covmat.pyx
/private/var/folders/m5/glqmglsx1nq7c8_cv23p7zv80000gp/T/build-env-372ga1cq/lib/python3.12/site-packages/Cython/Compiler/Main.py:381: FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /Users/jared/Projects/BAND/surmise/surmise/emulationsupport/matern_covmat.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)

Addressed by specifying the language level in setup.py

When running the test suite I get the following warning

tests/test_emu_cal/test_emu_pcgpwimpute.py::test_imputemethod[RandomForest-expectation2]
  /Users/jared/local/venv/surmise_test/lib/python3.12/site-packages/sklearn/impute/_iterative.py:825: ConvergenceWarning: [IterativeImputer] Early stopping criterion not reached.

Is this acceptable?

scikit-learn, when imported, seems to reset all warnings behavior. I have attempted to address this warning behavior by warnings.filterwarning('ignore', category=sklearn.exceptions.ConvergenceWarning) in the test function. The warning is expected given small sample size, so that tests do not run for too long. However, this attempt to ignore the warning in test has no effect.

Citations 2 & 9 are already published. Don't need "To Appear."

Citations 5 & 7 are missing their links.

Citation 6 is already published https://journals.aps.org/prc/abstract/10.1103/PhysRevC.108.054905

Addressed in docs/surmise.bib.

Collaborators have name/affilitions, but contributors don't. Is inconsistency intended?

The contributors are listed from interactions in raising Github issues and including specific interfacing codes. That is why I kept them in Github usernames.

@jared321
Copy link
Contributor

jared321 commented Sep 4, 2024

I did not see any matplotlib graphics when I ran the examples this time. This makes running the examples as python scripts less than useful.

@mosesyhc
Copy link
Member Author

mosesyhc commented Sep 4, 2024

I did not see any matplotlib graphics when I ran the examples this time. This makes running the examples as python scripts less than useful.

Do you think we should have the plots saved? Or should we leave it as before, where users have to manually close the figure to advance the code?

@jared321
Copy link
Contributor

jared321 commented Sep 4, 2024

What's the argument for non-blocking graphics?

@mosesyhc
Copy link
Member Author

mosesyhc commented Sep 4, 2024

What's the argument for non-blocking graphics?

When the examples were developed as notebooks, the code is not blocked by the plot. But when the examples are used as scripts, the code should not be blocked until each plot is closed. Perhaps it is my habit, but I tend to use scripts when I do not expect to interact with the code.

One possible solution is to save the figures, or only use the notebooks as examples.

@jared321
Copy link
Contributor

jared321 commented Sep 5, 2024

I usually have my scripts write graphics to file rather than have them be interactive as well. However, if you do that for examples you might want to at least print to stdout that you are writing files and print their names.

In terms of seeing how surmise is used, the notebooks seem more powerful and sufficient. You also provide links to them so that users don't even need to run the notebooks.

  • Are you also providing the examples as an extra set of code that users can run to test their installation?
  • Do you expect that some users would prefer to run such tests without having to install Jupyter?
  • Are you creating the scripts directly from the notebooks? How significant is the effort to maintain the script and notebook synchronized?

@mosesyhc
Copy link
Member Author

mosesyhc commented Sep 5, 2024

I usually have my scripts write graphics to file rather than have them be interactive as well. However, if you do that for examples you might want to at least print to stdout that you are writing files and print their names.

In terms of seeing how surmise is used, the notebooks seem more powerful and sufficient. You also provide links to them so that users don't even need to run the notebooks.

* Are you also providing the examples as an extra set of code that users can run to test their installation?

The examples mainly point to how surmise is used and they are not meant for testing. But, I listed the examples as tests after the issue we had with Example 3, when we accidentally remove functionality used in examples.

* Do you expect that some users would prefer to run such tests without having to install Jupyter?

I do not know. But Jupyter has become quite the standard now.

* Are you creating the scripts directly from the notebooks?  How significant is the effort to maintain the script and notebook synchronized?

The scripts are generated from the notebooks directly. They are no longer synchronized once changes are made to the scripts directly.

I am now inclined to only maintain the notebooks and request for tests be done in notebooks. I do think having the scripts that developers can run for sanity checks is good.

@jared321
Copy link
Contributor

jared321 commented Sep 5, 2024

As we discussed, adding tests to your notebook and ensuring that they are run as part of a PR review would certainly be good --- your notebooks then function as examples, a secondary test suite, and a secondary means for users evaluating their installations.

When I setup a software environment for a study, I do like to keep it as minimal and clean as possible. I also like to test it well before I start using it. Therefore, if a Python-based study were to be done without notebooks, then I would prefer to avoid installing Jupyter and all its dependences. In such a case, running your example scripts would be my best way forward in terms of testing. That said, it's unlikely that I personally would execute such a study without using Jupyter.

@mosesyhc
Copy link
Member Author

mosesyhc commented Sep 5, 2024

Those are great points. I have revised the testing checklist to use Jupyter notebooks instead.

@mosesyhc
Copy link
Member Author

mosesyhc commented Sep 5, 2024

(Finalized review checklist)
This major PR is ready for review. I am including the following checklist for surmise team reviewers (@wildsm , @ozgesurer):

  • Copy the following in a new comment for your review.

Functionality tests

  • Follow instruction under Installation to build wheel. Checkout from the v0.3-update branch before building the wheel.
    • This review item will change to simply pip install for BAND reviewers outside the surmise team, once we publish the new version on PyPI.
  • Install surmise v0.2.2+dev..., the wheel built from previous step.
  • Run the test suite under Testing
    • Warnings ConvergenceWarning in testing PCGPwImpute is expected.
  • Run Examples Jupyter notebooks 1 - 4 in examples/Examples*/
    • Warnings about empirical_coverage are expected in Example 3.
    • Example 4 can take significantly longer.
  • Run GP_surmise Jupyter notebook in examples/UsageTutorial/. This reproduces the public colab jupyter notebook.

Documentation review:

@mosesyhc mosesyhc requested review from wildsm and ozgesurer September 5, 2024 16:17
@ozgesurer
Copy link
Collaborator

ozgesurer commented Sep 8, 2024

Functionality tests

  • Follow instruction under Installation to build wheel. Checkout from the v0.3-update branch before building the wheel.

    • This review item will change to simply pip install for BAND reviewers outside the surmise team, once we publish the new version on PyPI.
  • Install surmise v0.2.2+dev..., the wheel built from previous step.

  • Run the test suite under Testing

    • Warnings ConvergenceWarning in testing PCGPwImpute is expected.
        1. I followed the installation steps you provided but encountered an error while running tests: ModuleNotFoundError: No module named 'surmise.emulationsupport.matern_covmat'.
        1. I then tried installing the package using the installation steps from the actions (via tox), and this time, the installation worked and the tests ran without any errors. Everything else seems fine as well.
        1. Out of curiosity, I revisited the step a) and was able to install it successfully.
        1. As a final test, I deleted my existing surmise repository, cloned it from scratch, and attempted the installation following the step a). Unfortunately, I encountered the same error.
        1. @mosesyhc , have you tried cloning surmise from scratch and installing it using the instructions provided?
  • Run Examples Jupyter notebooks 1 - 4 in examples/Examples*/

    • Warnings about empirical_coverage are expected in Example 3.
    • Example 4 can take significantly longer.
  • Run GP_surmise Jupyter notebook in examples/UsageTutorial/. This reproduces the public colab jupyter notebook.

Documentation review:

@mosesyhc
Copy link
Member Author

mosesyhc commented Sep 8, 2024

Thank you @ozgesurer.

    * 1. I followed the installation steps you provided but encountered an error while running tests: ModuleNotFoundError: No module named 'surmise.emulationsupport.matern_covmat'.
    * 2. I then tried installing the package using the installation steps from the actions (via tox), and this time, the installation worked and the tests ran without any errors. Everything else seems fine as well.
    * 3. Out of curiosity, I revisited the step a) and was able to install it successfully.
    * 4. As a final test, I deleted my existing surmise repository, cloned it from scratch, and attempted the installation following the step a). Unfortunately, I encountered the same error.
    * 5. @mosesyhc , have you tried cloning surmise from scratch and installing it using the instructions provided?

I did clone surmise from scratch and installing it as in the instructions, and the ModuleNotFoundError did not occur. We have run into this Cython issue for several times and did some changes to setup.py to include the cythonize part.

Would you mind saving the console output and I can investigate further? Thank you.

@ozgesurer
Copy link
Collaborator

Those are the steps that I follow after going to the root directory:

  • python3 -m venv venv/
  • source venv/bin/activate
  • python -m pip install --upgrade pip
  • pip install build Cython
  • pip install scikit-learn
  • python -m build --wheel
  • pip install dist/surmise-0.2.2.dev232+g41fd3a6-cp39-cp39-macosx_10_9_x86_64.whl
  • pip install pytest pytest-cov scikit-learn
  • python -m pytest
    I will print my output in the following message.

@ozgesurer
Copy link
Collaborator

=============================================== test session starts ================================================
platform darwin -- Python 3.9.7, pytest-8.3.2, pluggy-1.5.0
rootdir: /Users/ozgesurer/Desktop/GithubRepos/surmise
configfile: pyproject.toml
plugins: cov-5.0.0
collected 273 items / 1 error

====================================================== ERRORS ======================================================
_______________________ ERROR collecting tests/test_emu_cal/test_cal_directbayeswoodbury.py ________________________
surmise/emulation.py:147: in init
importlib.import_module('surmise.emulationmethods.' + method)
/opt/anaconda3/lib/python3.9/importlib/init.py:127: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
:1030: in _gcd_import
???
:1007: in _find_and_load
???
:986: in _find_and_load_unlocked
???
:680: in _load_unlocked
???
:850: in exec_module
???
:228: in _call_with_frames_removed
???
surmise/emulationmethods/PCGPwM.py:8: in
from surmise.emulationsupport.matern_covmat import covmat as __covmat
E ModuleNotFoundError: No module named 'surmise.emulationsupport.matern_covmat'

During handling of the above exception, another exception occurred:
tests/test_emu_cal/test_cal_directbayeswoodbury.py:87: in
emulator_1 = emulator(x=x, theta=theta_lin, f=f_lin, method='PCGPwM')
surmise/emulation.py:149: in init
raise ValueError('Module not loaded correctly.')
E ValueError: Module not loaded correctly.
============================================= short test summary info ==============================================
ERROR tests/test_emu_cal/test_cal_directbayeswoodbury.py - ValueError: Module not loaded correctly.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================ 1 error in 58.47s =================================================

@ozgesurer
Copy link
Collaborator

BTW, @mosesyhc Given that I was able to install using the action's install steps (via tox), maybe it is better to switch the instructions to tox version as well. This might be the quick and secured way of installing.

@mosesyhc
Copy link
Member Author

mosesyhc commented Sep 9, 2024

BTW, @mosesyhc Given that I was able to install using the action's install steps (via tox), maybe it is better to switch the instructions to tox version as well. This might be the quick and secured way of installing.

Thank you, @ozgesurer. The main installation will still be through pip and the available wheels.

My assumption is that only developers or few users who do not find a wheel available on PyPI are required to build surmise from scratch. This may be a good rationale to instruct users to use tox if they wish to build surmise.

@jared321, any thoughts?

@jared321
Copy link
Contributor

jared321 commented Sep 9, 2024

I believe that this is related to the existence of the Cython output file matern_covmat.cpython-312-darwin.so in your clone and what folder you run pytests from.

  • The .so file does not exist in a clean repo
  • Building the wheel creates the .so file in a scratch space and packs it into the package
    • Run to confirm that its in the package tar tvfz dist/surmise-0.2.2.dev232+g41fd3a6-cp312-cp312-macosx_14_0_arm64.whl (alter the package name to match your case)
  • Installing the wheel puts the .so file in the installation
    • Run find ./venv/lib/python3.12/site-packages/surmise -name '*.so' adapted to the path of your venv to confirm
  • Running pytest from /path/to/surmise results in the failure for me
  • Running pytest from /path/to/surmise/tests succeeds
  • The .so file does not exist in the clone still
    • Run find ./surmise -name '*.so' to confirm no .so files at all
  • tox -r -e coverage creates the .so file in the clone
    • Rerun find ./surmise -name '*.so' to confirm .so file now exists!
  • Still using the venv venv, running pytest from /path/to/surmise now succeeds.
  • rm ./surmise/emulationsupport/matern_covmat.cpython-312-darwin.so and running pytests from /path/to/surmise fails again

Note that running tox -e nocoverage does not leave the .so file in the clone and indeed fails in a similar way. I suspect that running the coverage task leaves the .so file in a place that must automatically be included in the PYTHONPATH when you run from /path/to/surmise. In particular, Python is likely finding the local .so file automatically and can import it, but its not using the correct .so file that was installed from the wheel. I prefer the source layout over the flat layout because these technical subtleties don't usually pop up.

I do not understand the test suite implementation well enough to understand why you must run it from /path/to/surmise/tests. @mosesyhc Based on the above can you now reproduce the error in your setup? Can you explain why we must run from the tests folder?

All that to say that if the above makes sense the test instructions should just include cd /path/to/surmise/tests before running pytest. Note that this should go away with the changes proposed in Issue #74.

@mosesyhc
Copy link
Member Author

mosesyhc commented Sep 9, 2024

@ozgesurer @jared321: Thank you so much for helping with diagnosis.

I am able to reproduce the error, with @jared321's comments. The pytest command fails in the directory /path/to/surmise/ but succeeds in the directory /path/to/surmise/tests/. At one point in my testing, I wrongly believed that we have resolved the problem, possibly because of some *.so (or *.c / *.pyd files in my case) remaining in the path.

@ozgesurer: would you mind trying if running python -m pytest works inside surmise/tests? I will update the readme, and hopefully that is the last of it (for now)!

@ozgesurer
Copy link
Collaborator

Within the tests directory I was able to run both python -m pytest and ./run-tests.sh without an error. @mosesyhc I can merge if you are ready :)

@mosesyhc mosesyhc merged commit 7b242aa into main Sep 10, 2024
46 checks passed
@mosesyhc mosesyhc deleted the v0.3-update branch September 16, 2024 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants