deeparg bioconda singularity container is borked #23

Closed
jfy133 opened this issue Mar 3, 2022 · 35 comments
Labels
bug Something isn't working

Comments

@jfy133
Member

jfy133 commented Mar 3, 2022

Description of the bug

Need to fix it (this module will plague me forever, it seems...)

I'm hoping the warning and the error are related. I'll try adding g++ to the conda recipe under run.

❯ cat .command.log
/usr/local/lib/python2.7/site-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
Traceback (most recent call last):
  File "/usr/local/bin/deeparg", line 7, in <module>
    from deeparg.entry import main
  File "/usr/local/lib/python2.7/site-packages/deeparg/entry.py", line 10, in <module>
    import deeparg.predict.bin.deepARG as clf
  File "/usr/local/lib/python2.7/site-packages/deeparg/predict/bin/deepARG.py", line 12, in <module>
    from lasagne import layers
  File "/usr/local/lib/python2.7/site-packages/lasagne/__init__.py", line 27, in <module>
    import pkg_resources
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 3251, in <module>
    @_call_aside
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 3235, in _call_aside
    f(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 3264, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 574, in _build_master
    ws = cls()
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 567, in __init__
    self.add_entry(entry)
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 623, in add_entry
    for dist in find_distributions(entry, True):
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2065, in find_on_path
    for dist in factory(fullpath):
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2135, in distributions_from_metadata
    root, entry, metadata, precedence=DEVELOP_DIST,
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2592, in from_location
    py_version=py_version, platform=platform, **kw
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2994, in _reload_version
    md_version = self._get_version()
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2772, in _get_version
    version = _version_from_file(lines)
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2556, in _version_from_file
    line = next(iter(version_lines), '')
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2767, in _get_metadata
    for line in self.get_metadata_lines(name):
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1432, in get_metadata_lines
    return yield_lines(self.get_metadata(name))
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1420, in get_metadata
    value = self._get(path)
  File "/usr/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1616, in _get
    with open(path, 'rb') as stream:
IOError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/PKG-INFO'

Command used and terminal output

No response

Relevant files

No response

System information

No response

@jfy133 jfy133 added the bug Something isn't working label Mar 3, 2022
@jfy133
Member Author

jfy133 commented Mar 3, 2022

Note to self

conda build recipes/deeparg

(to load that newly built environment for testing)

conda create -n deeparg --use-local deeparg

then apparently to build a docker container from built conda env...

conda index /home/jfellows/.conda/conda-bld/deeparg_1646292* ## modified from the biocontainers docs, but only produces some console log for half a second?
mulled-build build-and-test 'deeparg' --test 'deeparg -h' -c conda-forge,bioconda,file://home/jfellows/.conda/conda-bld/

Specifying the specific version of deeparg didn't seem to work, though.

UPDATE

conda index -c ~/.conda/conda-bld/ --verbose
## check inside .conda/conda-bld/noarch for full name 
mulled-build build-and-test 'deeparg=1.0.2--pyh6bb024c_2' --test 'deeparg -h' -c conda-forge,bioconda,file://home/jfellows/.conda/conda-bld/

Doesn't solve the PKG issue though...

From https://galaxy-lib.readthedocs.io/en/latest/topics/mulled.html#building-docker-containers-for-local-conda-packages

(Note: it seems the BioContainers docs are out of date with respect to the latest mulled toolkit)

@jfy133
Member Author

jfy133 commented Mar 9, 2022

@Midnighter identified that the files behind the Errno 13 error are not world-readable (mode -rw-r-----, owned by root), even inside the Docker container?!

bash-4.2# cd /usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/ 
bash-4.2# ls -l
total 48
-rw-r-----    1 root     root         11440 Aug  4  2016 PKG-INFO
-rw-r-----    1 root     root         19460 Aug  4  2016 SOURCES.txt
-rw-r-----    1 root     root             1 Aug  4  2016 dependency_links.txt
-rw-r-----    1 root     root            47 Apr 21  2016 pbr.json
-rw-r-----    1 root     root           112 Aug  4  2016 requires.txt
-rw-r-----    1 root     root             7 Aug  4  2016 top_level.txt
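
For anyone who wants to check this programmatically rather than by eye, here is a minimal sketch, assuming Python 3 and a throwaway directory standing in for the egg-info folder. The function name `unreadable_by_others` and the demo files are made up for illustration. It lists regular files that lack the world-read bit, which is what the `-rw-r-----` entries above show.

```python
import os
import stat
import tempfile


def unreadable_by_others(directory):
    """Return sorted names of regular files in `directory` without the world-read bit."""
    blocked = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        mode = os.stat(path).st_mode
        if stat.S_ISREG(mode) and not mode & stat.S_IROTH:
            blocked.append(name)
    return blocked


# Demo on a temporary directory that mimics the listing above (hypothetical files).
demo_dir = tempfile.mkdtemp()
for name, mode in [("PKG-INFO", 0o640), ("README", 0o644)]:
    path = os.path.join(demo_dir, name)
    with open(path, "w") as handle:
        handle.write("")
    os.chmod(path, mode)  # set the exact mode, independent of the umask

print(unreadable_by_others(demo_dir))  # prints ['PKG-INFO']
```

A non-root user inside the container would hit the same `Permission denied` on any file this reports.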

@jfy133
Member Author

jfy133 commented Mar 9, 2022

Suggestion again from @Midnighter :

update the DerpARG bioconda recipe to patch the files.

So take the pip install command that is under build and move it into a dedicated build.sh file, where that command is run and the permissions are changed immediately afterwards.

@jfy133
Member Author

jfy133 commented Mar 9, 2022

Maybe chmod +r $PREFIX/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info

$PREFIX from https://github.com/bioconda/bioconda-recipes/blob/b97a5e584ba54a7498f13accd2aa2459d20f8a35/recipes/quast/build.sh

@jfy133
Member Author

jfy133 commented Mar 9, 2022

First attempt:

#!/bin/bash
set -eu -o pipefail

python -m pip install --no-deps .

find "$PREFIX" -name '*egg-info'
chmod -R +r "$PREFIX/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info"
ls -lha "$PREFIX/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/"*
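
The same idea (find every `*egg-info` directory and add read bits) can be sketched in Python for testing outside a conda build. This is only an illustration under assumptions: Python 3, a temporary directory standing in for `$PREFIX`, and a `lib/python2.7/site-packages` layout chosen for the demo. The helper name `make_egg_info_readable` is hypothetical and is not part of any recipe.

```python
import fnmatch
import os
import stat
import tempfile

READ_ALL = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH


def make_egg_info_readable(prefix):
    """Roughly `chmod -R a+r` on every *egg-info directory below `prefix`."""
    changed = []
    for root, _dirs, files in os.walk(prefix):
        parts = os.path.relpath(root, prefix).split(os.sep)
        if not any(fnmatch.fnmatch(part, "*egg-info") for part in parts):
            continue
        for path in [root] + [os.path.join(root, name) for name in files]:
            mode = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, mode | READ_ALL)  # only adds read bits, removes nothing
            changed.append(path)
    return changed


# Demo: a stand-in prefix with one restrictive metadata file (assumed layout).
prefix = tempfile.mkdtemp()
egg_dir = os.path.join(
    prefix, "lib", "python2.7", "site-packages", "Theano-0.8.2-py2.7.egg-info"
)
os.makedirs(egg_dir)
pkg_info = os.path.join(egg_dir, "PKG-INFO")
with open(pkg_info, "w") as handle:
    handle.write("Version: 0.8.2\n")
os.chmod(pkg_info, 0o640)

make_egg_info_readable(prefix)
print(oct(stat.S_IMODE(os.stat(pkg_info).st_mode)))  # prints 0o644
```

Taking the prefix as a parameter means the helper walks whatever directory it is given and only touches paths that actually exist there.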

@jfy133
Member Author

jfy133 commented Mar 9, 2022

We suspect the specific version of the Theano package used is SO OLD that it ships with overly restrictive permissions that don't work with modern tooling. So the above should loosen those permissions (we hope).

@jfy133
Member Author

jfy133 commented Mar 9, 2022

That didn't work either, so I'm going to remove support for DeepARG for now.

@jfy133 jfy133 closed this as completed Mar 9, 2022
@Midnighter
Contributor

I was thinking, what if we change the docker.runOptions or singularity one (whichever failed) to run this single process as root?

@jfy133
Member Author

jfy133 commented Mar 9, 2022

> I was thinking, what if we change the docker.runOptions or singularity one (whichever failed) to run this single process as root?

It was singularity that failed... I dunno... Maybe worth a shot? Could you try it out?

@Midnighter
Contributor

Currently, I don't have singularity installed but might get around to it.

@jfy133
Member Author

jfy133 commented Mar 10, 2022

Ah OK, I'll see if I can find the relevant options.

@jfy133
Member Author

jfy133 commented Mar 10, 2022

@jfy133
Member Author

jfy133 commented Mar 10, 2022

This does actually work, but I'm not sure how to customise runOptions on a per-process basis.

@jfy133 jfy133 reopened this Mar 10, 2022
@jfy133
Member Author

jfy133 commented Mar 10, 2022

Let's see if this is possible...

@jfy133
Member Author

jfy133 commented Mar 15, 2022

It is with --fakeroot! And on a per-process basis with containerOptions! Thanks and 🍪 to @bentsherman!

@jfy133
Member Author

jfy133 commented Mar 16, 2022

Ah, hit a limitation though:

[screenshot not preserved]

@jfy133
Member Author

jfy133 commented Mar 16, 2022

We need AWS batch support for the full_tests :\

@jfy133
Member Author

jfy133 commented Mar 16, 2022

Ah, but only Singularity is affected, so if we can turn on containerOptions only for Singularity runs, then that would work 🤔

@Midnighter
Contributor

Should be possible with

containerOptions "${workflow.containerEngine == 'singularity' ? '--fakeroot' : ''}"

@jfy133
Member Author

jfy133 commented Mar 16, 2022

Aha, workflow. is the trick (are all these variables listed anywhere in the nf docs?)

@Midnighter
Contributor

For workflow you can find it here https://www.nextflow.io/docs/latest/metadata.html, I haven't found a list yet for task introspection.

@bentsherman

Support for AWS Batch was only added recently, check the edge docs.

Currently containerOptions is not supported by Kubernetes, Google Life Sciences, or Azure.

@bentsherman

Also funny enough I just submitted a Nextflow issue about the task object because it isn't documented: nextflow-io/nextflow#2732

@grst
Member

grst commented Mar 16, 2022

--fakeroot is not available on all systems (needs to be set-up properly), so I would be careful with that.

@jfy133
Member Author

jfy133 commented Mar 17, 2022

The workaround is in.

@pontus has also kindly pointed out a possibly safer and more portable solution, which is to make our own copy of PKG-INFO and pass that in with bind paths:

singularity run -B PKG-INFO:/usr/local/lib/python2.7/site-packages/Theano-0.8.2-py2.7.egg-info/PKG-INFO

This is a nice solution, as -B is pretty ubiquitous across most versions of Singularity AFAIK; however, we would need a way to store this file alongside the module and get it staged.

Or we just make that a required input file....?

@grst
Member

grst commented Mar 17, 2022

If this is the only file, then yes, that's a perfect solution.
I doubt this file is required, maybe just stage an empty file?

@jfy133
Member Author

jfy133 commented Mar 17, 2022

@Midnighter thinks it's required... but I can just dump it onto test-datasets ;)

@Midnighter
Contributor

If pkg_resources.get_distribution is happy with parsing an empty file, then an empty file also works. I doubt the actual version string is used anywhere (but I don't know for certain).
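
To illustrate why an empty file might be acceptable, here is a simplified stand-in for the version lookup, modelled on the `line = next(iter(version_lines), '')` frame visible in the traceback at the top of this issue. It is not the actual pkg_resources code, and the function name `version_from_lines` is made up; it only shows that this style of lookup falls back to an empty string instead of raising when no `Version:` header is present.

```python
def version_from_lines(lines):
    """Return the value of the first 'Version:' header, or None if there is none."""
    version_lines = (
        line for line in lines if line.lower().startswith("version:")
    )
    # An empty file yields no lines at all, so the default '' is used here.
    line = next(iter(version_lines), "")
    _, _, value = line.partition(":")
    return value.strip() or None


print(version_from_lines(["Metadata-Version: 1.1", "Name: Theano", "Version: 0.8.2"]))  # prints 0.8.2
print(version_from_lines([]))  # prints None
```

Whether the real library is equally tolerant still has to be checked inside the container itself.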

@jfy133
Member Author

jfy133 commented Mar 17, 2022

OK, maybe I'll try that... a simple touch PKG-INFO...?

EDIT: tomorrow ;)

@pontus

pontus commented Mar 17, 2022

Ah, yes, I should have tried that :)

If I run interactively, I can do import lasagne in python as per

Singularity> python
Python 2.7.15 | packaged by conda-forge | (default, Mar  5 2020, 14:56:06) 
[GCC 7.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import lasagne
WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to execute optimized C-implementations (for both CPU and GPU) and will default to Python implementations. Performance will be severely degraded. To remove this warning, set Theano flags cxx to an empty string.
/usr/local/lib/python2.7/site-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
>>> 
>>> 

fine with an empty file (no difference in output compared to an actual copy).

@jfy133
Member Author

jfy133 commented Mar 18, 2022

OK, touch PKG-INFO doesn't work because (obviously) the container is loaded before the script block is executed.

But I will try just executing against multiqc_config.yaml so we don't need to have a specific file.

@jfy133
Member Author

jfy133 commented Apr 20, 2022

I think we fixed it in the end 👍

@jfy133 jfy133 closed this as completed Apr 20, 2022
@hugolefeuvre

I'm currently wrapping the deepARG tool so that it can be integrated into Galaxy, which requires the tool to be tested. These tests are carried out automatically in a Docker container with a Docker pull of the biocontainer of interest (in this case deepARG).
And I'm encountering the exact same error as the one in this issue.
So I was wondering if you can remember how you solved this problem and if you could explain it more precisely because I don't understand how it was solved with the existing comments.
Thanks in advance if anyone comes across it.

@jfy133
Member Author

jfy133 commented Dec 20, 2024

Hi @hugolefeuvre !

Basically, we ended up just mounting a dummy file (the bash binary of the execution node) under the name PKG-INFO to satisfy it:

https://github.com/nf-core/funcscan/blob/master/modules%2Fnf-core%2Fdeeparg%2Fpredict%2Fmain.nf#L10-L18

Not ideal, but we've not had anyone report issues with it yet.
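
For readers who cannot follow the link, the general shape of the workaround can be sketched as follows. This is a hedged illustration only, not a copy of the linked module: it assumes Python 3, the helper name `bind_option` is made up, and the only things taken from this thread are the `-B src:dest` form and the PKG-INFO path inside the container.

```python
import shutil

# Path of the unreadable metadata file inside the container (from the traceback).
PKG_INFO_IN_CONTAINER = (
    "/usr/local/lib/python2.7/site-packages/"
    "Theano-0.8.2-py2.7.egg-info/PKG-INFO"
)


def bind_option(host_file, container_path=PKG_INFO_IN_CONTAINER):
    """Build the `-B src:dest` arguments that overlay a readable host file."""
    return ["-B", "{}:{}".format(host_file, container_path)]


# Any readable file that exists on the host will do; bash is the thread's choice.
# The fallback path is an assumption for hosts where bash is not on PATH.
host_file = shutil.which("bash") or "/bin/bash"
print(" ".join(bind_option(host_file)))
```

The resulting string is what would be passed to Singularity as a container option. Binding a file that is guaranteed to exist on the execution node avoids having to stage a dedicated dummy file with the pipeline.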

@hugolefeuvre

It worked perfectly, thanks @jfy133 !


6 participants