Cannot reach https://busco-data.ezlab.org/v5/data/file_versions.tsv #333
Comments
This seems bad. Could you additionally try using --busco_reference or --busco_download_path? That would mean having the files locally and therefore skipping any download step. |
Also, please do not use |
I have seen the same error, even when specifying either (--busco_reference "https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2020-03-06.tar.gz") or (--busco_download_path "path/to/bacteria_odb10") |
I am facing the same issue as well |
Like @jboktor, I can confirm that using --busco_reference or --busco_download_path does not change the outcome. |
Hi, I had a similar problem recently. In my case though it was solvable using |
@skrakau I think that's because --busco_download_path refers to the directory where the BUSCO lineage files are located. It fails to retrieve https://busco-data.ezlab.org/v5/data/file_versions.tsv, which is not among the lineage files. Please correct me if I'm wrong. Regards |
Hi @ChristophKnapp , yes it refers to the directory containing among others a folder with the lineage files, but this should or could also contain a The nf-core/mag parameter Line 42 in a8e92af
which should prevent BUSCO from trying to download anything. That's why I was confused that it still tries to download the file_versions.tsv file, but if the file is missing it probably makes sense that BUSCO fails.
|
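For anyone checking a local download directory before a run, here is a minimal sketch of the point above: if file_versions.tsv is missing locally, BUSCO may still try to fetch it. The function name `missing_offline_files` is hypothetical, and the layout (a `file_versions.tsv` file next to a `lineages/` folder) is an assumption about how the download directory is organized, not something confirmed in this thread.

```python
from pathlib import Path


def missing_offline_files(download_path, lineage):
    """Return the expected local entries that are absent.

    Assumed layout (not confirmed in this thread):
        <download_path>/file_versions.tsv
        <download_path>/lineages/<lineage>/
    """
    root = Path(download_path)
    expected = [
        root / "file_versions.tsv",     # version index BUSCO tried to download
        root / "lineages" / lineage,    # e.g. bacteria_odb10
    ]
    # Keep only the entries that do not exist on disk.
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    # Hypothetical path; replace with the directory passed to the pipeline.
    for entry in missing_offline_files("path/to/busco_downloads", "bacteria_odb10"):
        print("missing:", entry)
```

An empty result only says the two entries exist; it does not verify their contents.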
The question remains why the download of the file fails, so talking to the BUSCO developers might be a good idea anyway. If you create an issue, could you link it here? Otherwise I could also do it next week. |
@skrakau, I would prefer it if you did it. You have more insight into what is going on and a better understanding of how BUSCO is integrated. Thank you Christoph |
I opened an issue: https://gitlab.com/ezlab/busco/-/issues/593 Feel free to add further details, in case I forgot something. |
Apparently a rate limit was introduced on the BUSCO server a while ago. It probably caused problems in particular when multiple BUSCO processes were running in parallel, which also explains why wget works without problems. This rate limit will be increased. Independently of this, we should update BUSCO to version 5.4.x at some point, which contains a failsafe mechanism that reattempts a connection in case of failure. |
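To illustrate what "reattempts a connection in case of failure" can look like, here is a minimal sketch of a retry loop with exponential backoff around the file_versions.tsv download. This is not BUSCO's actual failsafe code; `fetch_url`, `fetch_with_retry`, and the default attempt count and delays are all hypothetical choices for illustration.

```python
import time
import urllib.request

FILE_VERSIONS_URL = "https://busco-data.ezlab.org/v5/data/file_versions.tsv"


def fetch_url(url, timeout=30.0):
    """Download a URL and return the raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retry(url, fetch=fetch_url, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on connection errors with exponential backoff.

    Waits base_delay, then 2x, 4x, ... between attempts, which spreads out
    requests when many parallel jobs hit a rate-limited server.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except OSError as error:  # urllib.error.URLError is a subclass of OSError
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise ConnectionError(f"Cannot reach {url} after {attempts} attempts") from last_error


if __name__ == "__main__":
    data = fetch_with_retry(FILE_VERSIONS_URL)
    print(len(data), "bytes downloaded")
```

The `fetch` and `sleep` arguments are injectable so the retry logic can be exercised without network access.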
hi @skrakau , the fix works. Thanks! |
FYI (maybe that will help someone with similar issue):
What helped in my case was a combination of both:
|
I will close this issue, as the original download problem caused by the rate limit was fixed. Feel free to open a new issue if similar problems occur again. @bmlab-sg if your issue remains or recurs, please also open a new, separate issue. |
I've just run into this old issue now, with version
It's probably a similar issue with multiple BUSCO jobs attempting to access the URL, and their server blocking new connections after a while:
|
I guess the only solution here is to download the database manually :/ and pass that to the pipeline instead |
Weirdly enough, I tried this now and STILL get the same error. To be more specific, I downloaded the archive with
And this is the contents of
Why is BUSCO still trying to access I'm attaching the full log file, in case it helps: |
Ugh, that looks bad... Maybe it always does an internet lookup? I've not actually used BUSCO manually myself... @skrakau if you remember, do you have any ideas? |
Facing the exact same issue, currently testing the |
Please let me know if it works @b-kolar - I started investigating this yesterday at the airport but couldn't finish before I had to fly. Otherwise I'll get back to this on Thursday |
I can confirm that the We are testing a modified version of the mag pipeline now, which has so far passed the Busco steps without issues. |
Thank you @b-kolar ! I might ping you when my implementation is ready to make sure we added it roughly in the same way, if that's ok ? |
@jfy133 No problem, feel free to send any questions my way! |
Description of the bug
Hello,
When I start nf-core-mag it runs for some time and then stops with
ERROR: BUSCO analysis failed for some unknown reason! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err.
See the attached error log. BUSCO already has a fixed issue for this problem (https://gitlab.com/ezlab/busco/-/issues/567). That's why I am posting it here first. Tell me to go away if you think they should reopen that issue.
I tried to access https://busco-data.ezlab.org/v5/data/file_versions.tsv with wget and curl and had no problem downloading it from the machine this runs on. Therefore I don't think this is a firewall issue of some sort, but I could be wrong. After all, I don't know exactly how BUSCO tries to access the file.
I also thought at first that this might just be an internet hiccup, so I resumed the analysis after testing whether I could download this file. That was not the case: the error occurs every time I resume.
Thanks for your help
Christoph
Command used and terminal output
Relevant files
MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log
MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err.txt
System information
N E X T F L O W ~ version 22.04.5
nf-core/mag v2.2.0
Container engine: conda
OS:
Distributor ID: Debian
Description: Debian GNU/Linux 10 (buster)
Release: 10
Codename: buster
Hardware: desktop with 128 GB RAM and 32 cores