dockerHSSM #439

panwanke · 2024-05-21T10:20:11Z

Add a dockerfile for dockerizing the HSSM and a docker-image.yml for allowing github action.
I tested the all notebooks in tutorials folder, and fix some issues.

digicosmos86

Thank you for this PR! We generally discourage our users from using docker, especially since they can easily install hssm from conda-forge now, but there could be some special use cases for docker. I've suggested some changes. Since we can't see the diffs of the notebooks, can you let us know what are the changes and fixes to the notebooks?

Thanks again for contributing!

.github/workflows/docker-image.yml

Dockerfile

digicosmos86 · 2024-05-21T13:38:37Z

Dockerfile

+
+USER $NB_UID
+
+RUN conda config --add channels  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/


This might be an issue for most of the users. Any reason why we cannot use conda-forge?

Oh, and that's because in my tests in China, using this proxy to prevents installation failures. Our previous experience building dockerHDDM tells me that this being applied to github should work as well. Of course, one can also use the original forge and should get the same results.

After trying it out, I realised that using conda would take hours, whereas using pip was significantly faster, so I ended up using pip.

In our experience, we encountered an issue with the pip-installed pytensor not being able to find the blas library, resulting it falling back to the numpy C-API and much slower performance overall. That's why we worked to put hssm on PyPI. Is there any issue like this with this docker image?

I'm not sure if we installed pytensor correctly. But using the following code, I found that pytensor should recognize the blas library.

import pytensor print(pytensor.config.blas__ldflags) # -L/opt/conda/lib -llapack -lblas -lcblas -lm -Wl,-rpath,/opt/conda/lib

Please let me know if there are other better testing methods to ensure that pytensor is installed correctly

Dynamically fetch versions; and perform github action either at release time or manually.

add workflow_dispatch

update: install toml

panwanke · 2024-05-23T00:40:06Z

Thank you for your reply. I am reporting the issues and changes in the notebooks below. These issues have been discussed with Alex and any help needed is appreciated.

obs_angle = ssms.basic_simulators.simulator(...) --> simulatorobs_angle = ssms.basic_simulators.simulator.simulator(...)inlikelihoods.ipynb`

Replace all config.update("jax_enable_x64", False) and pytensor.config.floatX = "float32" with hssm.set_floatX("float32")

Glad to see you guys updated your compatibility with jax config a few hours ago. And I still suggest if it is possible to use hssm.set_floatX consistently in all notebooks?

Remove missing api in main_tutorial.ipynb:

from hssm.distribution_utils import (
    make_distribution,  # A general function for making Distribution classes
    make_distribution_from_onnx,  # Makes Distribution classes from onnx files
    make_distribution_from_blackbox,  # Makes Distribution classes from callables
)

to

from hssm.distribution_utils import make_distribution

There is a compatibility issus with latests pymc(5.15) that makes .graph() not work. So, in the docker env, I fix pymc version to 5.14.

The major modifications are the file path references.

Firstly, cavanagh_theta_test.csv and cavanagh_idata.nc were referenced in plotting.ipynb, and they were in the tests/fixtures folder. So I mapped that folder to the docker's src directory as shown below. This allows you to reference this data via src/cavanagh_theta_test.csv.
Secondly, given that model files such as *.onnx exist under fixtures and may not be downloadable on some network conditions, to facilitate the use of these models in docker, I added loglik="src/*onnx" to all places where onnx is needed, including pymc.ipynb, likelihoods.ipynb and main_tutorials.ipynb. such as

model_model_comp_1 = hssm.HSSM(
    data=dataset_model_comp,
    model="angle",
    loglik="src/angle.onnx",
    a=1.0,
)

digicosmos86 · 2024-05-23T14:47:26Z

@panwanke Thank you again for your help with this! We really appreciate the effort in fixing the inconsistencies in our notebooks! In principle I agree with all the changes but the last one. We want to keep the notebooks directly executable in Google Colab, so switching to a local reference to the onnx files might break that. Maybe you can download all networks from the huggingface repo to the docker image, and then add references (commented out) to the onnx files locally? People can then uncomment the code when there is no Internet

panwanke · 2024-05-24T02:20:09Z

@panwanke Thank you again for your help with this! We really appreciate the effort in fixing the inconsistencies in our notebooks! In principle I agree with all the changes but the last one. We want to keep the notebooks directly executable in Google Colab, so switching to a local reference to the onnx files might break that. Maybe you can download all networks from the huggingface repo to the docker image, and then add references (commented out) to the onnx files locally? People can then uncomment the code when there is no Internet

Thanks for the suggestion. I'm sorry I didn't think about running in colab before. I've now commented out where I need to use onnx and added an explanation. For example, like this,

ddm_model_approx_diff = hssm.HSSM(
    data, model="ddm", 
    loglik_kind="approx_differentiable",
    loglik="ddm.onnx",          # will be downloaded from huggingface
    # loglik="src/ddm.onnx", # If running in dockerHSSM, the local file is available
)

panwanke added 2 commits May 21, 2024 17:34

add dockerfile and github action yml

d5e43ad

update notebooks for dockerHSSM

1bd8b70

digicosmos86 requested changes May 21, 2024

View reviewed changes

panwanke added 5 commits May 22, 2024 09:56

Update docker-image.yml

0fadf0d

Dynamically fetch versions; and perform github action either at release time or manually.

simplify the dockerfile

ecd520f

Update docker-image.yml

eb474f2

add workflow_dispatch

Update docker-image.yml

f7d4e77

update: install toml

based on pip

fd6b80c

commented onnx local reference and add explanation.

19c40a7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dockerHSSM #439

dockerHSSM #439

panwanke commented May 21, 2024

digicosmos86 left a comment

digicosmos86 May 21, 2024

panwanke May 22, 2024

digicosmos86 May 23, 2024

panwanke May 24, 2024

panwanke commented May 23, 2024

digicosmos86 commented May 23, 2024

panwanke commented May 24, 2024 •

edited

Loading


		USER $NB_UID

		RUN conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/

dockerHSSM #439

Are you sure you want to change the base?

dockerHSSM #439

Conversation

panwanke commented May 21, 2024

digicosmos86 left a comment

Choose a reason for hiding this comment

digicosmos86 May 21, 2024

Choose a reason for hiding this comment

panwanke May 22, 2024

Choose a reason for hiding this comment

digicosmos86 May 23, 2024

Choose a reason for hiding this comment

panwanke May 24, 2024

Choose a reason for hiding this comment

panwanke commented May 23, 2024

digicosmos86 commented May 23, 2024

panwanke commented May 24, 2024 • edited Loading

panwanke commented May 24, 2024 •

edited

Loading