Possibility to specify plugins within test profiles or non-top scopes #1964
Comments
Maybe it's worth an explanation: the reason why I wanted to load the … A related question would be, if it's possible to specify …
To add to this - I was just caught out by the same thing when adding the following to a minimal example:

```groovy
plugins {
    id 'nf-amazon'
}
```

It triggered the following error: …
The plugins declaration cannot go in the pipeline script. One workaround would be to add it on the CLI when specifying the genome profile, e.g. …
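For illustration, a hedged sketch of what that CLI workaround could look like, using the run command's `-plugins` option (the pipeline name and profiles below are placeholders, not taken from this thread):

```console
$ nextflow run nf-core/rnaseq -profile test,docker -plugins nf-amazon
```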
I know, bit boring.
Or even better if it doesn't have to be defined at all 🙄 😉 I'm pretty worried that these are going to cause chaos for all @nf-core pipelines if I'm honest, with our heavy usage of reference genomes on AWS...
yeah, but then the problem is that when offline, the plugin cannot be downloaded :/
When offline the s3 paths can't be downloaded either..
I think we can agree on this 😄
So is the idea that we define the plugin name in all nf-core pipelines in order to be able to use the AWS-iGenomes references in them? Does that break stuff for anyone wanting to use a different object storage system for their data? (Apologies if this is the wrong place to have this discussion..)
Putting the plugin name in all nf-core pipelines would break execution when running in an offline environment, because NF would try to download the plugin. What I'm not understanding is: are the AWS-iGenomes references used by all pipelines?
Most @nf-core pipelines have the igenomes config (it comes with the pipeline), yeah. Users can configure the base path to use a local directory if all of iGenomes is downloaded, and of course use their own references. But a lot of people just use the AWS-iGenomes directly (based on the download stats anyway).
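As a concrete sketch of that base-path override (`igenomes_base` is the conventional nf-core parameter name; treat the exact name and paths as assumptions):

```groovy
// custom.config: illustrative sketch, parameter name and path are assumptions
params {
    igenomes_base = '/data/references/igenomes'  // local mirror instead of the s3:// bucket
}
```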
Need to check if it's possible to have the …
What I don't really understand is why it needs to be in the pipeline code at all. Ignoring the AWS-iGenomes thing for a minute, surely most of the time this will be something that a user needs to manage rather than a pipeline developer? If it's possible to have all of these plugins installed at once and Nextflow knows how to deal with the s3 paths, can it not just be part of the …
Because the plan is to have application plugins, for example to handle a SQL db or to access a dataset that requires some special library. This is why the requirement is that the pipeline should declare it in the pipeline config.
Actually, the …
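As a sketch of that kind of pipeline-level declaration (the plugin id and version below are illustrative assumptions, not something stated in this thread):

```groovy
// nextflow.config shipped with the pipeline: illustrative sketch
plugins {
    id 'nf-sqldb@0.1.0'  // assumed plugin id/version, shown only to illustrate the pattern
}
```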
Right, I'm not against plugins per se - quite the opposite. I think that it can and will be a super powerful feature. Your examples of pipelines which are fixed to specific data sources are super nice. My objection is for things where the pipeline developer can't know about data sources, primarily file access. Almost all Nextflow pipelines take files as inputs, and the beauty of Nextflow being so portable is that they can come from anywhere: local, https, ftp, buckets, etc. But by mandating that the pipeline developer needs to add … OK, so questions: …
Kind of disagree, because the pipeline is portable irrespective of the platform; but yes, the user pulling the data from a cloud should add the required plugin.
well, any of: command-line option, config file, or env variable
yup
Not much, because we already have amazon, azure, google cloud. Today we added dnanexus, and surely more will come. Putting all this stuff in as a core dependency would result in a huge bloated runtime, which is especially bad when pulling it in the cloud. However, I understand your concern that the user should not have to care about configuring the needed plugin when launching an nf-core pipeline. This is why I've also added a check that automatically adds the required plugins when a cloud executor is specified, e.g. when the executor is … I think the only problem that remains to address is when a pipeline launched locally needs to access cloud-stored files.
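A minimal sketch of what that inference implies, assuming the mapping from the `awsbatch` executor to the `nf-amazon` plugin described above:

```groovy
// nextflow.config: sketch of executor-based plugin inference
process.executor = 'awsbatch'  // declaring a cloud executor...
// ...lets Nextflow pull in the matching cloud plugin (here nf-amazon)
// automatically, without an explicit plugins { } block.
```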
Ok great, this puts my mind at ease quite a lot.. 😄
Could you clarify this a bit? My (limited) understanding was that they had to be declared in the pipeline …
Actually, it also checks the pipeline work dir; see nextflow/modules/nf-commons/src/main/nextflow/plugin/PluginsFacade.groovy, lines 237 to 254 at commit 914d7cb.
Installation != plugin requirement/activation. Installation means only downloading, unzipping and copying the plugin content into the HOME/.nextflow/plugins folder. The installation is done via the …
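For a concrete picture, a hedged sketch of what that folder can look like after an install (plugin names and versions here are placeholders):

```console
$ ls $HOME/.nextflow/plugins
nf-amazon-1.0.0    # downloaded and unpacked, i.e. installed
nf-google-1.0.0    # also installed, but only activated when a run requests it
```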
Worth mentioning, they are listed in order of priority, i.e. if 1 is provided, 2 is ignored, and so on.
S3 paths work only if the nf-amazon plugin is requested following the above mechanism. Possible solutions: …
I've managed to implement the solution at point 4. Therefore cloud plugins are inferred and started automatically in all cases. You may want to give it a try using …
Make sure you are using this version: …
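As a hedged illustration of pinning a specific runtime version via the launcher (the version string below is a placeholder, not the one elided above, and assumes the launcher can fetch that build):

```console
$ NXF_VER=21.04.0-edge nextflow run main.nf
```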
Amazing! And to uninstall plugins I can just do …?
yep
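Given the description of installation above (just unpacking into the HOME/.nextflow/plugins folder), a hedged guess at what uninstalling amounts to; the plugin version is a placeholder:

```console
$ rm -rf $HOME/.nextflow/plugins/nf-amazon-1.0.0   # removes the unpacked plugin folder
```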
This is available as of version …
New feature
Hi again, I need to come back to the question already mentioned in #1963 about specifying plugins within test.conf config files. Currently, with the Nextflow 21.03.0-edge release, when adding the nf-amazon plugin definition within a test.conf file (or I guess in any non-top scope) I get: …

Usage scenario
For some nf-core pipelines the full-size test data is stored on the Amazon S3 filesystem, and it would be very helpful if the corresponding test profile could still be run on non-AWS instances in the future, without the need for the user to specify extra custom config files for this. I do this very often for testing purposes and I think others from the nf-core team would also need this functionality.
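To make the scenario concrete, a hedged sketch of the kind of test config in question (paths and parameter names are illustrative, not from a specific pipeline):

```groovy
// conf/test.config: illustrative sketch, paths are placeholders
params {
    input = 's3://example-bucket/test_data/samplesheet.csv'  // full-size test data on S3
}
// What this issue asks for: declaring the needed plugin here too, e.g.
// plugins { id 'nf-amazon' }   // currently rejected in a non-top scope
```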