Inconsistent behaviour for glob process outputs #2425
Comments
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
I'm not aware of any changes in consideration of this issue. |
I agree that this can make workflows brittle. Can this issue be re-opened? |
I agree with your sentiment. It is poor form for a function to return a different type based on the outcome. Perhaps we could add an option to
The Inspired by numpy.squeeze |
That might be a good compromise if you aren't willing to break backwards compatibility. For my money I would break compatibility. I've not been in the Nextflow game long compared to many others, but I came across this bug the first time I tried to write a non-linear (but still fairly trivial) workflow with a In trivial cases best practice is presumably not to use a glob but more tightly match a single output directly by name (either fixed or through a variable). In code review I would definitely question a:
as being lazy (or a potential error) and request a replacement to something more specific. As soon as you are outside the realm of the trivial, it seems to me you will always want So for me a default of |
The thing is that Nextflow already has a pretty aggressive update schedule, and it is known as such. I think people tolerate it because it's easy to switch Nextflow versions, but even so, we try not to push people harder than we already do. If we implement the |
Duplicate of #1236 |
Well spotted. Here the plan was/is to extend the output (and input) declaration with the cardinality of the expected file to be captured e.g. This would serve both to resolve the ambiguity of the result type and to validate the number of files expected. update: the |
Fixed by 42504d3 |
Bug report
Expected behavior and actual behavior
According to https://www.nextflow.io/docs/latest/process.html#multiple-output-files, a glob pattern can be used to emit a list of multiple items from a process into an output channel. What is not clear from the documentation is that if only a single file matches the glob, the plain value is emitted rather than a length-1 list.
Returning different types from a function (a process in this instance) is typically considered bad practice and burdens the caller with having to perform introspection. In the context of Nextflow this means any operator on a channel first needs to check the returned type, possibly wrap an item that was assumed to be a list, and only then safely perform operations such as mapping over the list.
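To make the burden on the caller concrete, here is a minimal Python model of the behaviour described above. It is not Nextflow code and does not reflect Nextflow's implementation: `emit_glob` and `as_list` are hypothetical names, and plain strings stand in for file paths.

```python
from typing import List, Union


def emit_glob(matches: List[str]) -> Union[str, List[str]]:
    """Model of the reported behaviour (an assumption, not Nextflow source):
    a single match is emitted as a bare value, several matches as a list."""
    if len(matches) == 1:
        return matches[0]
    return list(matches)


def as_list(item: Union[str, List[str]]) -> List[str]:
    """Caller-side guard: wrap a bare value so downstream code always sees a list."""
    return item if isinstance(item, list) else [item]


single = emit_glob(["a.txt"])
several = emit_glob(["a.txt", "b.txt"])

# The emitted type depends on how many files happened to match.
print(type(single).__name__, type(several).__name__)

# With the guard in place, mapping over the result is safe in both cases.
print([name.upper() for name in as_list(single)])
print([name.upper() for name in as_list(several)])
```

Every consumer of such an output needs the equivalent of `as_list` before it can treat the value as a collection, which is the introspection step this report argues should not be necessary.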
Steps to reproduce the problem
Program output
(Copy and paste here output produced by the failing execution. Please highlight it as a code block. Whenever possible upload the .nextflow.log file.)
Environment
Additional context
I can see there being an argument about making a breaking change to Nextflow by enforcing that all globs return a list. Despite Nextflow's eschewing of heavy pattern-matching of file artifacts, I suspect globbing is abused a lot to emit single files, such that a change to emitting lists will break much existing code.
However, I do think it should at least be possible, optionally, to force globbed outputs to be lists. I suspect there are some corner cases around what a length-0 list might mean and how that is handled. At a minimum, the documentation needs updating to highlight that the returned type is not always a list.