Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix checkm subprocess #105

Merged
merged 3 commits into from
Aug 1, 2023
Merged

Fix checkm subprocess #105

merged 3 commits into from
Aug 1, 2023

Conversation

AroneyS
Copy link
Collaborator

@AroneyS AroneyS commented Aug 1, 2023

The command can fail with "No such file or directory" if the database path uses shell stuff.
Could open an injection vulnerability.

@AroneyS AroneyS requested review from wwood and rhysnewell August 1, 2023 02:59
@rhysnewell
Copy link
Owner

I believe that shell=True is used elsewhere throughout aviary but I think I'd like to avoid its use if possible, shell invocation via subprocess is generally frowned upon iirc. Can you provide an example as to what causes the original subprocess command to fail?

@AroneyS
Copy link
Collaborator Author

AroneyS commented Aug 1, 2023

import subprocess
subprocess.run("ls /")
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "/home/aroneys/m/users/aroneys/.conda/envs/aviary_dev/lib/python3.10/subprocess.py", line 503, in run
#     with Popen(*popenargs, **kwargs) as process:
#   File "/home/aroneys/m/users/aroneys/.conda/envs/aviary_dev/lib/python3.10/subprocess.py", line 971, in __init__
#     self._execute_child(args, executable, preexec_fn, close_fds,
#   File "/home/aroneys/m/users/aroneys/.conda/envs/aviary_dev/lib/python3.10/subprocess.py", line 1863, in _execute_child
#     raise child_exception_type(errno_num, err_msg, err_filename)
# FileNotFoundError: [Errno 2] No such file or directory: 'ls /'

So if the db is under an absolute path, it wont work.

@AroneyS
Copy link
Collaborator Author

AroneyS commented Aug 1, 2023

Could sanitise the input using shlex.quote(), see https://docs.python.org/3/library/subprocess.html#security-considerations

@AroneyS
Copy link
Collaborator Author

AroneyS commented Aug 1, 2023

Actually bad example since it doesn't have access to ls anyway

@AroneyS
Copy link
Collaborator Author

AroneyS commented Aug 1, 2023

I might be misdisagnosing this. Original error was FileNotFoundError: [Errno 2] No such file or directory: 'CHECKM2DB=/mnt/hpccs01/work/microbiome/db/CheckM2_database/uniref100.KO.1.dmnd checkm2 predict -i data/metabat_bins_2// -x fa -o data/metabat_bins_2/checkm2_out -t 64 --force'. So might not be finding checkm2. But shell=True does fix it.

@rhysnewell
Copy link
Owner

Yeah, okay I think this was a problem in the past looking through code. Might be some weird interaction with invoking sub processes from within a conda environment? Like the main python executable is not in the same conda environment as checkm2 so python can't find it until you invoke a shell command?

Copy link
Owner

@rhysnewell rhysnewell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like aviary is riddled with shell=True instances and we should probably clean them up using that function you listed previously, so maybe add it to this command as a test to make sure it still works after being cleaned

@AroneyS
Copy link
Collaborator Author

AroneyS commented Aug 1, 2023

I think its the use of a string input that is the issue. Doing subprocess.run("ls /".split()) works fine. So we could do it using split(), but then we would have to move the env variable outside the subprocess call.
e.g. https://github.com/AroneyS/ibis/blob/48ec215630a8426733d84884730d58aa437353da/test/test_evaluate.py#L332

So not too hard. Do you want me to have a look at the other shell=True instances?

@rhysnewell
Copy link
Owner

I'll sort out the other ones in a different pr, that code in general needs to be cleaned up. A lot of it is legacy stuff from the original slamM snakemake script so there might be some weird cases

@AroneyS
Copy link
Collaborator Author

AroneyS commented Aug 1, 2023

This method works in my environment.

@rhysnewell rhysnewell merged commit aa90293 into dev Aug 1, 2023
@AroneyS AroneyS deleted the fix-checkm-subprocess branch August 2, 2023 00:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants