Function call with channel fails (seemingly) non-deterministically. #2447

Closed
MatthewJM96 opened this issue Nov 15, 2021 · 12 comments

MatthewJM96 commented Nov 15, 2021

Bug report

Note: I have been able to reproduce this, but only with difficulty and non-deterministically. If anyone has ideas on how to improve reproducibility, please let me know!

Expected behavior and actual behavior

The expected behaviour is, in defining a function such as:

def collect_file_tuples(Map map = [:], file_tuple, suffix, extension) {
    // Can't pass null or false to storeDir for some reason.
    if (map.store_dir) {
        file_tuple.collectFile(
            storeDir: map.store_dir,
            sort: "${map.sort != null ? map.sort : 'hash'}"
        ) { item ->
            [ "${item[0]}${suffix}.${extension}", item[map.file_idx ?: 1] ]
        }.map{
            file -> [file.simpleName.substring(0, file.simpleName.length() - suffix.length()), file]
        }
    } else {
        file_tuple.collectFile(
            sort: "${map.sort != null ? map.sort : 'hash'}"
        ) { item ->
            [ "${item[0]}${suffix}.${extension}", item[map.file_idx ?: 1] ]
        }.map{
            file -> [file.simpleName.substring(0, file.simpleName.length() - suffix.length()), file]
        }
    }
}

that a call such as:

collect_file_tuples(
    some_process.out.reads, "processed", "fq",
    file_idx: 1,
    sort: true
).set{ processed_reads }

will see that function executed on the channel passed in as an argument.

What happens, however, is a seemingly non-deterministic error of "wrong number of arguments" (Nextflow log attached). Runs succeed or fail with no apparent pattern: the exact same code will sometimes run and sometimes not, even after the .nextflow cache has been deleted.

I was previously running on Nextflow 21.04.3, and get the same result on 21.10.0.

Steps to reproduce the problem

In my full codebase the error occurs more often than not. The following replicates everything leading up to the error, but triggers it much less frequently (I got the error only twice in ~20 runs, each time with the same code and the Nextflow cache deleted):

test_func.nf

def test(Map map = [:], file_tuple, suffix, extension) {
    if (map.store_dir) {
        file_tuple.collectFile(
            storeDir: map.store_dir,
            sort: "${map.sort != null ? map.sort : 'hash'}"
        ) {
            item ->
                [ "new_name${suffix}.${extension}", item[map.file_idx ?: 1] ]
        }.map {
            file ->
                [ file.simpleName.substring(0, file.simpleName.length() - suffix.length()), file ]
        }
    } else {
        file_tuple.collectFile(
            sort: "${map.sort != null ? map.sort : 'hash'}"
        ) {
            item ->
                [ "new_name${suffix}.${extension}", item[map.file_idx ?: 1] ]
        }.map {
            file ->
                [ file.simpleName.substring(0, file.simpleName.length() - suffix.length()), file ]
        }
    }
}

test.nf

nextflow.enable.dsl = 2

include { some_long_process } from './process.nf'

include { test } from './test_func.nf'

workflow impl {
take:
    reads
main:
    some_long_process(reads)

    test(
        some_long_process.out.answer, '.hello', 'world',
        sort: true, file_idx: 1
    ).set{ answer_ch }

    answer_ch.view()
emit:
    answer = answer_ch
}

workflow {
main:
    Channel.fromFilePairs("${projectDir}/reads/*_R{1,2}.fastq.gz", flat:true)
           .splitFastq(by:400000, pe:true, file:true)
           .set{ reads }

    impl(reads)
emit:
    answer = impl.out.answer
}

process.nf

process some_long_process {
    label 'in_container'

    input:
    tuple val(id), file(read), file(paired_read)

    output:
    path "${id}.log", emit: log
    tuple val(id), file("${id}.txt"), emit: answer

    executor = 'lsf'
    cpus = 1
    queue = '<some queue>'
    clusterOptions = 'span[hosts=1]'
    time '10m'

    """
    sleep 3
    echo "this is a log" >> ${id}.log
    echo $read >> ${id}.txt
    echo $paired_read >> ${id}.txt
    echo "$id + 3 = 4" >> ${id}.txt
    """
}

test.nfconfig

singularity {
    enabled = true
    autoMounts = true
}
process {
    withLabel: in_container {
        container = "file://${projectDir}/a-container-with-bash.simg"
    }
}

The only concrete thing I can say is that whenever I supply NXF_DEBUG=3 to my fuller codebase - which fails almost every time otherwise - it will succeed nearly every time.

I don't know whether the container, or defining the process and the function in separate files, is necessary, but I have only obtained the error in this form. The error rate for this test case is much lower, though: across ~20 runs (fresh cache each time), with and without the container, I got two failures, both when running the process in the container.

Program output

stdout

N E X T F L O W  ~  version 21.10.0
Launching `./workflows/part/preparatory.nf` [drunk_shaw] - revision: 7312136437
wrong number of arguments

 -- Check script 'workflows/part/preparatory.nf' at line: 24 or see '.nextflow.log' file for more details

[-        ] process > preparatory:preparatory_impl:trimmomatic -

.nextflow.log

Nov-15 15:22:44.556 [main] DEBUG nextflow.cli.Launcher - $> ./exe/nextflow-21.10.0-all -C ./workflows/config/preparatory.nfconfig run ./workflows/part/preparatory.nf -entry preparatory
Nov-15 15:22:44.670 [main] INFO  nextflow.cli.CmdRun - N E X T F L O W  ~  version 21.10.0
Nov-15 15:22:44.712 [main] INFO  nextflow.cli.CmdRun - Launching `./workflows/part/preparatory.nf` [drunk_shaw] - revision: 7312136437
Nov-15 15:22:44.729 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /lustre/scafellpike/local/HT04000/mxm11/shared/dsl2_nextflow_test/workflows/config/preparatory.nfconfig
Nov-15 15:22:44.753 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `standard`
Nov-15 15:22:45.814 [main] DEBUG nextflow.plugin.PluginsFacade - Setting up plugin manager > mode=prod; plugins-dir=/gpfs/fairthorpe/local/HT04000/mxm11/mxm56-mxm11/.nextflow/plugins
Nov-15 15:22:45.816 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins default=[]
Nov-15 15:22:45.833 [main] INFO  org.pf4j.DefaultPluginStatusProvider - Enabled plugins: []
Nov-15 15:22:45.834 [main] INFO  org.pf4j.DefaultPluginStatusProvider - Disabled plugins: []
Nov-15 15:22:45.838 [main] INFO  org.pf4j.DefaultPluginManager - PF4J version 3.4.1 in 'deployment' mode
Nov-15 15:22:45.849 [main] INFO  org.pf4j.AbstractPluginManager - No plugins
Nov-15 15:22:45.903 [main] DEBUG nextflow.Session - Session uuid: 0e181676-b7d7-4c6e-91f1-436ff11088c0
Nov-15 15:22:45.903 [main] DEBUG nextflow.Session - Run name: drunk_shaw
Nov-15 15:22:45.904 [main] DEBUG nextflow.Session - Executor pool size: 64
Nov-15 15:22:45.915 [main] DEBUG nextflow.cli.CmdRun - 
  Version: 21.10.0 build 5641
  Created: 11-11-2021 18:37 UTC (18:37 BST)
  System: Linux 3.10.0-1127.el7.x86_64
  Runtime: Groovy 3.0.9 on OpenJDK 64-Bit Server VM 1.8.0_242-b08
  Encoding: UTF-8 (UTF-8)
  Process: 158384@hcxlogin2 [172.31.204.62]
  CPUs: 64 - Mem: 125.3 GB (5.2 GB) - Swap: 7.8 GB (7.7 GB)
Nov-15 15:22:45.951 [main] DEBUG nextflow.Session - Work-dir: /lustre/scafellpike/local/HT04000/mxm11/shared/dsl2_nextflow_test/work [nfs]
Nov-15 15:22:45.952 [main] DEBUG nextflow.Session - Script base path does not exist or is not a directory: /lustre/scafellpike/local/HT04000/mxm11/shared/dsl2_nextflow_test/workflows/part/bin
Nov-15 15:22:46.040 [main] DEBUG nextflow.executor.ExecutorFactory - Extension executors providers=[GoogleLifeSciencesExecutor, AwsBatchExecutor, IgExecutor]
Nov-15 15:22:46.054 [main] DEBUG nextflow.Session - Observer factory: DefaultObserverFactory
Nov-15 15:22:46.056 [main] DEBUG nextflow.Session - Observer factory: TowerFactory
Nov-15 15:22:46.107 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 65; maxThreads: 1000
Nov-15 15:22:46.254 [main] DEBUG nextflow.Session - Session start invoked
Nov-15 15:22:46.591 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
Nov-15 15:22:46.978 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 65; maxThreads: 1000
Nov-15 15:22:47.130 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:normal_16c_1n` matches labels `normal_16c_1n,container_trimmomatic` for process with name preparatory:preparatory_impl:trimmomatic
Nov-15 15:22:47.131 [main] DEBUG nextflow.script.ProcessConfig - Config settings `withLabel:container_trimmomatic` matches labels `normal_16c_1n,container_trimmomatic` for process with name preparatory:preparatory_impl:trimmomatic
Nov-15 15:22:47.136 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: lsf
Nov-15 15:22:47.136 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'lsf'
Nov-15 15:22:47.144 [main] DEBUG nextflow.executor.Executor - [warm up] executor > lsf
Nov-15 15:22:47.150 [main] DEBUG n.processor.TaskPollingMonitor - Creating task monitor for executor 'lsf' > capacity: 100; pollInterval: 5s; dumpInterval: 5m 
Nov-15 15:22:47.154 [main] DEBUG n.executor.AbstractGridExecutor - Creating executor 'lsf' > queue-stat-interval: 1m
Nov-15 15:22:47.157 [main] DEBUG nextflow.executor.LsfExecutor - [LSF] Detected lsf.conf LSF_UNIT_FOR_LIMITS=MB
Nov-15 15:22:47.242 [main] DEBUG nextflow.Session - Session aborted -- Cause: wrong number of arguments
Nov-15 15:22:47.255 [main] DEBUG nextflow.Session - The following nodes are still active:
  [operator] splitFastq
  [operator] collectFile
  [operator] view

Nov-15 15:22:47.357 [main] ERROR nextflow.cli.Launcher - wrong number of arguments
java.lang.IllegalArgumentException: wrong number of arguments
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at nextflow.script.FunctionDef.invoke_a(FunctionDef.groovy:57)
	at nextflow.script.ComponentDef.invoke_o(ComponentDef.groovy:41)
	at nextflow.script.WorkflowBinding.invokeMethod(WorkflowBinding.groovy:94)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeOnDelegationObjects(ClosureMetaClass.java:408)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:350)
	at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.callCurrent(PogoMetaClassSite.java:61)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:171)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:212)
	at Script_27fe6a45$_runScript_closure1$_closure3.doCall(Script_27fe6a45:24)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
	at groovy.lang.Closure.call(Closure.java:412)
	at groovy.lang.Closure.call(Closure.java:406)
	at nextflow.script.WorkflowDef.run0(WorkflowDef.groovy:186)
	at nextflow.script.WorkflowDef.run(WorkflowDef.groovy:170)
	at nextflow.script.BindableDef.invoke_a(BindableDef.groovy:52)
	at nextflow.script.ComponentDef.invoke_o(ComponentDef.groovy:41)
	at nextflow.script.WorkflowBinding.invokeMethod(WorkflowBinding.groovy:94)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeOnDelegationObjects(ClosureMetaClass.java:408)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:350)
	at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.callCurrent(PogoMetaClassSite.java:61)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:171)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:185)
	at Script_27fe6a45$_runScript_closure2$_closure7.doCall(Script_27fe6a45:63)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
	at groovy.lang.Closure.call(Closure.java:412)
	at groovy.lang.Closure.call(Closure.java:406)
	at nextflow.script.WorkflowDef.run0(WorkflowDef.groovy:186)
	at nextflow.script.WorkflowDef.run(WorkflowDef.groovy:170)
	at nextflow.script.BindableDef.invoke_a(BindableDef.groovy:52)
	at nextflow.script.ChainableDef$invoke_a.call(Unknown Source)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:139)
	at nextflow.script.BaseScript.runDsl2(BaseScript.groovy:191)
	at nextflow.script.BaseScript.run(BaseScript.groovy:200)
	at nextflow.script.ScriptParser.runScript(ScriptParser.groovy:221)
	at nextflow.script.ScriptRunner.run(ScriptRunner.groovy:212)
	at nextflow.script.ScriptRunner.execute(ScriptRunner.groovy:120)
	at nextflow.cli.CmdRun.run(CmdRun.groovy:309)
	at nextflow.cli.Launcher.run(Launcher.groovy:480)
	at nextflow.cli.Launcher.main(Launcher.groovy:639)
Nov-15 15:22:47.360 [Actor Thread 17] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null

Environment

  • Nextflow version: 21.10.0
  • Java version:
    • openjdk version "1.8.0_242"
    • OpenJDK Runtime Environment (build 1.8.0_242-b08)
    • OpenJDK 64-Bit Server VM (build 25.242-b08, mixed mode)
  • Operating system: linux
  • Bash version: 4.2.46(2)-release

@tverbeiren

I encounter a similar issue: intermittent "wrong number of arguments" errors that pop up for no apparent reason in a relatively simple pipeline project.

@pditommaso
Member

Having a self-contained example to replicate the issue would help a lot here.

@tverbeiren

tverbeiren commented Feb 9, 2022

In my case the problem was caused by defining a function with a default argument value.

This is probably related to: #1698

@SamThePsychoticLeprechaun I noticed you have some default argument values in your function definition as well, I suggest you try and remove those to see if this resolves the issue for you?

I must say that other people working on the same pipeline have not observed this issue at all, maybe because of a different Java runtime?

@pditommaso
Member

I would rule out a side effect of a different Java runtime version; I have never seen that happen before.

@MatthewJM96
Author

Hi, I ran the case where this occurs with high frequency 30 times with the default argument excluded on the map parameter. I got no error on any of the runs and I got the correct outputs. When I added the default argument back in, I immediately got the same error as described earlier. Thanks for the suggestion @tverbeiren.
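
A minimal sketch of the change described above, in plain Groovy: the default value is dropped from the Map parameter, so only one signature exists. The function name makeList and its body are illustrative assumptions, not the code from the pipeline. A caller with no named arguments then has to pass an empty map explicitly.

```groovy
// Hypothetical helper: the Map parameter has no default value,
// so Groovy generates a single method signature for it.
def makeList(Map args, arg1, arg2) {
    def merged = [arg3: 0] + args   // apply the fallback value for arg3
    return [arg1, arg2, merged.arg3]
}

// Named arguments are still collected into the leading Map.
println makeList(1, 2, arg3: 3)

// Without named arguments, the empty map must be passed explicitly.
println makeList([:], 1, 2)
```

The cost of this form is the explicit `[:]` at call sites that have no options to pass.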

@pditommaso, I'm not super familiar with Java/Groovy, but it looks to me like using a map for named parameters that may not be needed is a common pattern, and I understand that your statement on #1698 in 2020 was that default arguments are not supported. Perhaps this could be reconsidered, as it would contribute towards DSL2's goal of enabling better software engineering practices in Nextflow.

@cjw85
Contributor

cjw85 commented Jun 29, 2022

We have this too when we tried to implement optional parameters into existing shared code to avoid breaking the API. We also see non-deterministic errors: sometimes the code runs fine, other times it complains: wrong number of arguments.

@SamThePsychoticLeprechaun you are correct that putting a Map as the first argument to consume and collect named arguments is a common idiom; it is highlighted in the groovy-lang documentation.

The Nextflow documentation states:

The Nextflow scripting language is an extension of the Groovy programming language.

This is misleading: it's clear that Nextflow script isn't a superset of Groovy. The semantics of function calls in Nextflow workflow script are not the same as in Groovy.

The following illustrates the point:

nextflow.enable.dsl = 2

def makeChannel(Map args=[:], arg1, arg2) {
    args = [arg3:0] + args
    return [arg1, arg2, args.arg3]
}

println makeChannel(1,2)
println makeChannel(1,2,arg3:3)

workflow {
    main:
        println makeChannel(1,2)
        println makeChannel(1,2,arg3:3)
}

The above leads to the following output:

N E X T F L O W  ~  version 22.04.4
Launching `main.nf` [soggy_mirzakhani] DSL2 - revision: 59de25e664
[1, 2, 0]
[1, 2, 3]
wrong number of arguments

 -- Check script 'main.nf' at line: 19 or see '.nextflow.log' file for more details

It seems that a first-position Map args=[:] isn't supported in the workflow context.

We've found two workarounds for this. The first is to abandon Nextflow script entirely and write a "standard" Groovy class with static methods:

main.nf:

nextflow.enable.dsl = 2

workflow {
   main:
       chan3 = MakeChannel.make(1, 2, arg3:3)
       chan3.view()
       chan0 = MakeChannel.make(1, 2)
       chan0.view()
}

lib/MakeChannel.groovy

import nextflow.Channel
import groovyx.gpars.dataflow.DataflowWriteChannel

class MakeChannel {
    public static DataflowWriteChannel make(Map args=[:], arg1, arg2) {
        args = [arg3:0] + args
        return Channel.from([arg1, arg2, args.arg3])
    }
}

This appears to work more robustly. I do not know how you would call a Nextflow process within this method however.

The other workaround is rather hilarious, and I wouldn't recommend it: write functions with a single Map argument! This necessitates validating the presence of required arguments.

nextflow.enable.dsl = 2

def MakeChannel(Map args) {
    required = ["arg1", "arg2"]
    if (!required.every{it -> args.containsKey(it)}) {
        throw new Exception("Provide arguments: ${required}")
    }
    nargs = [arg3:0] + args
    return Channel.from([nargs.arg1, nargs.arg2, nargs.arg3])

}

workflow {
    main:
        chan3 = MakeChannel([arg1:1, arg2:2, arg3:3])
        chan3.view()
        chan0 = MakeChannel([arg1:1, arg2:2])
        chan0.view()
        chan = MakeChannel([arg1:1])
        chan.view()
}

This allows you to stick with writing simple Nextflow function definitions but is slightly silly!

@pditommaso
Member

pditommaso commented Jun 30, 2022

This is a problem that has been recently identified. In a nutshell, function overloading, i.e. two or more functions with the same name and different signatures, is not supported by Nextflow.

Using a default parameter is essentially the same as creating two versions of the same function, with and without that parameter.

Recent versions of Nextflow detect this problem and report a warning message (c0b522ab).
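
The equivalence between a default parameter and two overloads can be checked in plain Groovy with reflection. This is only a sketch: the class name Demo is an assumption made for illustration, and the code inspects what the Groovy compiler generates rather than anything inside Nextflow.

```groovy
// Hypothetical class with one method that has a defaulted Map parameter.
class Demo {
    static List makeChannel(Map args = [:], arg1, arg2) {
        def merged = [arg3: 0] + args
        return [arg1, arg2, merged.arg3]
    }
}

// Collect the parameter counts of every compiled method named 'makeChannel'.
def arities = Demo.class.declaredMethods
    .findAll { it.name == 'makeChannel' }
    .collect { it.parameterCount }
    .sort()

// One source-level function, but more than one compiled signature.
println "overloads found: ${arities.size()} with arities ${arities}"
```

A lookup that assumes a single function per name therefore has two candidates to choose from.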

@jorgeaguileraseqera
Contributor

I think @cjw85's approach is a good way to work around the issue, but I think it's better to "separate" the business logic (the make function in this case) from the workflow logic (channels, processes, etc.).

For example, we can create a Functions.groovy library with methods such as:

class Functions {

    // returns a list using an item of the map
    def make(Map args=[:], arg1, arg2) {
        args = [arg3:0] + args
        [arg1, arg2, args.arg3]
    }

    // returns a default list
    def make(){
       ['default value 1', 'default value 2']
    }
    
    // returns a list of a single value or default if value not provided
    def make( obj ){
       obj ? "$obj".split(",") : make()
    }

}

These functions don't need to be static, and we can split the logic into several classes/methods because it's a "pure" Groovy class.
(As a plus, you can debug and test this class outside Nextflow, for example using the groovy command line.)

The main.nf:

nextflow.enable.dsl = 2

def func = new Functions()

workflow {
    main:
        list = func.make()
        chn = Channel.from( list )
        chn.view()
}

We instantiate a Functions object (I've called it func, but it can be named anything).

A more elaborate workflow can be:

nextflow.enable.dsl = 2

def func = new Functions()

workflow fixedValues {
    main:
        list = func.make(1, 2, arg3:3)
        chn = Channel.from( list )
        chn.view()
}

workflow defValues {
    main:
        list = func.make()
        chn = Channel.from( list )   
        chn.view()

}

workflow{
   main:
      fixedValues()
      defValues()
}

Or, for example, a workflow that creates the list from the params:

nextflow.enable.dsl = 2

def func = new Functions()

workflow {
    main:
        list = func.make(params.input)
        chn = Channel.from( list )
        chn.view()
}

In this case we are calling make( obj ), which decides whether to return a list built from the value or the default values.

@cjw85
Contributor

cjw85 commented Jul 20, 2022

but I think it's better to "separate" the business logic (make function in this case) from the workflow logic (channels, process, etc)

I wouldn't get hung up on the examples I gave that constructed channels within the functions; that was just the first thing I wrote to illustrate the point along the same lines as the OP. Furthermore, I think it's perfectly valid in more complex cases to want to create a channel from within a library function, say after having run a process to grab data from an external source.

Separating business logic from workflow logic isn't the goal here; it's to have a way to emulate the kind of function polymorphism afforded by the Groovy def func(Map args=[:], ...) idiom, or the (somewhat cleaner) Python style of using positional and keyword arguments for required and optional arguments: def func(arg1, arg2, arg3=42).

As @pditommaso points out, function overloading isn't supported in the Nextflow codebase: there is a soft assumption when looking up a function that there is only one with the given name. @SamStudio8 did the code dive for us a few weeks back, incidentally uncovering why the failures were non-deterministic for the OP: the function to be called is pulled from a data structure with no stable ordering, so sometimes the function picked has the correct signature for the supplied arguments, and sometimes not.
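
The failure mode can be imitated in plain Groovy. The sketch below is a stand-in, not Nextflow's lookup code: the class Lib is hypothetical, and the script simply invokes each compiled method named make with the same two positional arguments, as a name-only lookup might. Whichever candidate such a lookup returns decides whether the call succeeds.

```groovy
// Hypothetical library function with a defaulted leading Map parameter.
class Lib {
    static List make(Map args = [:], a, b) {
        [a, b, args.arg3 ?: 0]
    }
}

// Every compiled method that a lookup by name alone could return.
def candidates = Lib.class.declaredMethods.findAll { it.name == 'make' }

// Invoke each candidate reflectively with two positional arguments
// and record the result, keyed by the candidate's parameter count.
def outcomes = candidates.collectEntries { m ->
    def result
    try {
        m.invoke(null, [1, 2] as Object[])
        result = 'ok'
    } catch (IllegalArgumentException e) {
        result = e.message   // same exception type as in the log above
    }
    [(m.parameterCount): result]
}

println outcomes
```

The two-parameter candidate accepts the call, while the three-parameter one rejects it with an IllegalArgumentException, which matches the exception type in the stack trace attached to this issue.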

In the end we worked around the issue by implementing a library function that performs Nextflow function argument validation for us, along the lines of the technique I previously described as a bit silly. It at least allows us to create reusable library functions with default argument values. For example:

import ArgumentParser

def my_func(Map stuff){
    def p = new ArgumentParser(
        args:["input"],
        kwargs:["opt1":null, "opt2":1],
        name:"my_func")  // required just for logging, and introspecting the callstack to get it seems impossible
    Map margs = p.parse_args(stuff)
    // margs has members named after all the args and kwargs above
    ...
}

where ArgumentParser is: ArgumentParser.groovy. It's somewhat goofy, but:

  1. the required and optional arguments are documented close to the function signature,
  2. validation is performed on the argument list, so the worst of uninterpretable runtime errors are avoided,
  3. no additional knowledge of Groovy is required by workflow developers.
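
For readers without access to the linked file, here is a rough guess at what such a helper could look like. Everything in it is an assumption for illustration: only the constructor keys (args, kwargs, name) and the parse_args call are taken from the usage shown above, and the real ArgumentParser.groovy may behave differently.

```groovy
// Hypothetical minimal version; not the linked ArgumentParser.groovy.
class ArgumentParser {
    List args = []      // names of required arguments
    Map kwargs = [:]    // optional argument names mapped to default values
    String name = ''    // function name, used only in error messages

    Map parse_args(Map given) {
        // Fail early when a required argument is absent.
        def missing = args.findAll { !given.containsKey(it) }
        if (missing) {
            throw new IllegalArgumentException(name + ': missing required arguments ' + missing)
        }
        // Reject names that are neither required nor optional.
        def unknown = given.keySet().findAll { !(it in args) && !kwargs.containsKey(it) }
        if (unknown) {
            throw new IllegalArgumentException(name + ': unexpected arguments ' + unknown)
        }
        // Defaults first, then the caller's values override them.
        return kwargs + given
    }
}

def my_func(Map stuff) {
    def p = new ArgumentParser(args: ['input'], kwargs: [opt1: null, opt2: 1], name: 'my_func')
    Map margs = p.parse_args(stuff)
    return [margs.input, margs.opt1, margs.opt2]
}

println my_func(input: 'reads.fq', opt2: 5)
```

Because the function has a single Map parameter and no default values, only one signature is compiled, so the name-based lookup has nothing to confuse.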

@bentsherman
Member

Is this issue fixed by #3052 ?

@cjw85
Contributor

cjw85 commented Aug 9, 2022

I can't see a test in that PR that implements the pattern used by the OP; adding one seems wise.

@pditommaso
Member

This was first addressed by this commit, and full support for function overloading was later added via #3052.

6 participants