NanoSim installation failure #162

Open
cjwoodruff50 opened this issue Apr 14, 2022 · 13 comments

Comments

@cjwoodruff50

NanoSim_installation_problems_15April2022.txt

@kmnip
Collaborator

kmnip commented Apr 15, 2022

Similar to #161:
If you install from bioconda, it is less likely to have installation issues. For example:

conda create -n nanosim
conda activate nanosim
conda install -c bioconda nanosim

I think that requirements.txt is overly restrictive and it should be updated to something like so:

htseq
joblib
numpy>=1.21.5
pybedtools>=0.8.1
pysam>=0.15.3
scikit-learn>=0.22.1
scipy
six
genometools-genometools
last
minimap2
samtools

@cjwoodruff50
Author

cjwoodruff50 commented Apr 15, 2022 via email

@kmnip
Collaborator

kmnip commented Apr 15, 2022

Hi Chris,
Can you try installing version 0.21.3 of scikit-learn and re-run your simulation?

conda install scikit-learn=0.21.3

If the simulation doesn't work, then try upgrading scikit-learn to the latest version:

conda update scikit-learn

If that still doesn't work, then I think the pre-trained models need to be updated to work with newer versions of scikit-learn. In the meantime, you can also train your own models using public datasets.

@cjwoodruff50
Author

cjwoodruff50 commented Apr 17, 2022 via email

@SaberHQ
Collaborator

SaberHQ commented Apr 19, 2022

Hi Chris @cjwoodruff50

As @kmnip mentioned, the requirements.txt file is overly restrictive and that might be the reason for your package dependency issue. We will update it to make it easier to install packages using that file.

I am happy to hear that you were able to install NanoSim and make it run. I believe you said you still have a major problem running NanoSim. Please feel free to open another issue ticket and we will be more than happy to help you with that.

Finally, as for the scikit-learn package, version 0.22.1 works for me. You may see my comment (issue #131) on a similar problem here:

Please note that pull request #158 solves this issue by updating the scikit-learn version in requirements.txt

The previous sklearn.neighbors.kde has been renamed to sklearn.neighbors._kde in version 0.22.1. You probably have a version of scikit-learn older than that. Installing the latest release solves the problem:

pip install scikit-learn==0.22.1

For more information and help, please check this stackoverflow question/answer

I am closing this issue. If anyone finds a similar issue, please feel free to reopen it and we will be more than happy to help you. Thanks.

Originally posted by @SaberHQ in #131 (comment)
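
To see which of the two module paths from the rename described above exists in a given environment, something like the following could be run. It is a minimal sketch: the function name and candidate list are illustrative and not part of NanoSim or scikit-learn, and it only reports whether each module can be located, not whether a pre-trained model will load.

```python
import importlib.util

# Module paths mentioned in this thread: the old public path and the
# renamed private one.
KDE_MODULE_CANDIDATES = ["sklearn.neighbors.kde", "sklearn.neighbors._kde"]


def importable_modules(candidates):
    """Return the dotted module names from `candidates` that can be located."""
    found = []
    for name in candidates:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # ImportError covers a missing parent package (e.g. scikit-learn
            # is not installed at all).
            spec = None
        if spec is not None:
            found.append(name)
    return found


if __name__ == "__main__":
    print(importable_modules(KDE_MODULE_CANDIDATES))
```

If the old path is missing from the printed list, models that refer to sklearn.neighbors.kde would be expected to fail with the "No module named" error discussed in this issue.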

@cjwoodruff50
Author

cjwoodruff50 commented Apr 19, 2022 via email

@fgvieira

fgvieira commented Sep 4, 2024

@SaberHQ would it be possible to update the bioconda recipe?

It seems there is a constraint on scikit-learn, but only until version 0.3.1 (link).

@lcoombe
Member

lcoombe commented Sep 4, 2024

Hi @fgvieira,

Can you clarify what version of NanoSim you are trying to install and what error you are getting, if you are having issues with the installation?

@fgvieira

fgvieira commented Sep 5, 2024

@lcoombe I'd like to install the latest (v3.2.0) but it seems nanosim needs scikit-learn <=0.22.1 when using pretrained models. Since the dependencies do not reflect that, @dpryan79 added a patch to bioconda. The problem is that it is only applied to nanosim <= 3.1.0.

And why are the requirements on GitHub:

genometools-genometools
htseq=0.11.3
joblib=1.1.0
last
minimap2=2.17
numpy=1.21.5
pybedtools=0.8.1
pysam=0.15.3
samtools
scikit-learn=0.22.1
scipy=1.7.3
six=1.16.0

not the same as in bioconda:

https://github.com/bioconda/bioconda-recipes/blob/973b7f6a6ac2fe31721674b86e9fdd92f292abbe/recipes/nanosim/meta.yaml#L18-L34
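
For reference, the pinned list above can be compared with the relaxed list proposed earlier in this thread. The following is a minimal sketch: the parser and function names are illustrative, it handles only the simple `name`, `name=version` and `name>=version` forms that appear here, and it does not read the bioconda meta.yaml.

```python
import re

# Pinned requirements as quoted in this comment.
PINNED = """
genometools-genometools
htseq=0.11.3
joblib=1.1.0
last
minimap2=2.17
numpy=1.21.5
pybedtools=0.8.1
pysam=0.15.3
samtools
scikit-learn=0.22.1
scipy=1.7.3
six=1.16.0
"""

# Relaxed requirements proposed earlier in the thread.
RELAXED = """
htseq
joblib
numpy>=1.21.5
pybedtools>=0.8.1
pysam>=0.15.3
scikit-learn>=0.22.1
scipy
six
genometools-genometools
last
minimap2
samtools
"""

# Package name, then an optional operator and version.
LINE_RE = re.compile(r"([^<>=\s]+)\s*(?:(>=|<=|==|=)\s*(\S+))?$")


def parse_requirements(text):
    """Map package name -> (operator, version); both are None if unpinned."""
    specs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised requirement line: {line!r}")
        name, op, version = match.groups()
        specs[name] = (op, version)
    return specs


def constraint_changes(old, new):
    """Return {name: (old_spec, new_spec)} for packages whose spec differs."""
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


if __name__ == "__main__":
    changes = constraint_changes(parse_requirements(PINNED), parse_requirements(RELAXED))
    for name, (before, after) in changes.items():
        print(name, before, "->", after)
```

Both lists name the same twelve packages; only the version constraints differ.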

@lcoombe
Member

lcoombe commented Sep 5, 2024

Thanks for clarifying your question.
If you want to use the pre-trained models, you can install scikit-learn<=0.22.1 in your conda environment, or specify that when you conda install (conda install nanosim "scikit-learn<=0.22.1"; the quotes keep the shell from treating <= as a redirection). We can add a note about that to the README page. We can look into adding another patch to conda, but it becomes a bit overly restrictive for those who are not using the pre-trained models.

As @kmnip and @SaberHQ mentioned above, the requirements.txt is overly restrictive.

@fgvieira

fgvieira commented Sep 6, 2024

If nanosim requires scikit-learn<=0.22.1, why not add this requirement directly to the bioconda recipe (as it is specified in the github requirements.txt)?
It would be nice to, at least, have the same set of requirements on both bioconda and github.

@kmnip
Collaborator

kmnip commented Sep 6, 2024

If nanosim requires scikit-learn<=0.22.1, why not add this requirement directly to the bioconda recipe (as it is specified in the github requirements.txt)? It would be nice to, at least, have the same set of requirements on both bioconda and github.

Because NanoSim does not require scikit-learn<=0.22.1 to run at all, and it can run with newer versions of scikit-learn. This particular requirement was meant for backward compatibility with the old (i.e. outdated) pre-trained models. Users are more than welcome to install a newer version of scikit-learn and train their own models that are the most suitable for their work. For example, as noted in the readme, the most recent "dorado" model was "trained using NanoSim v3.0.2 with scikit-learn v0.23.2 and python v3.7.10."

This is also written in the README:

NOTE: Please kindly note that the pretrained models in NanoSim (v3.0.2) were made using an older version of scikit-learn (e.g. <=0.22.1). If you have to use these models (instead of creating your own models), then you must use scikit-learn=0.22.1 but not the newer versions. If you have a newer version of scikit-learn installed, then you will get the error for No module named 'sklearn.neighbors.kde'. If you would like to create your own models (instead of using the pretrained models), then NanoSim should work just fine with scikit-learn=1.0.2 from our experience. For future releases of NanoSim, we will try to include newly pre-trained models with the updated versions of required packages in order to solve the incompatibility issues.
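
The version boundary in the README note can be checked before trying to load the pre-trained models. This is a minimal sketch, assuming Python 3.8+ for importlib.metadata; the function names and the simple numeric version parsing are illustrative and not part of NanoSim.

```python
import re
from importlib import metadata

# Upper bound for the older pre-trained models, taken from the README note.
PRETRAINED_MAX_SKLEARN = (0, 22, 1)


def parse_version(text):
    """Turn the leading numeric part of a version, e.g. '0.22.1', into a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def pretrained_models_supported(sklearn_version):
    """True if the given scikit-learn version is at or below the README bound."""
    return parse_version(sklearn_version) <= PRETRAINED_MAX_SKLEARN


def installed_sklearn_version():
    """Return the installed scikit-learn version string, or None if absent."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_sklearn_version()
    if version is None:
        print("scikit-learn is not installed")
    elif pretrained_models_supported(version):
        print(f"scikit-learn {version}: within the range stated for the pre-trained models")
    else:
        print(f"scikit-learn {version}: newer than 0.22.1; train your own models or install 0.22.1")
```

The check only mirrors the README statement; it does not verify that a particular model file actually loads.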

@fgvieira

fgvieira commented Sep 6, 2024

@kmnip thanks for your detailed reply! It is just a bit confusing, since scikit-learn=0.22.1 is specified as a dependency in requirements.txt.
If NanoSim works with all versions of scikit-learn, shouldn't this version constraint be removed? Otherwise, it should at least match the dependencies on conda.
