-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE] allow null
in integer format when multiple: true
#705
Comments
Hey @kverstae ! Thanks for creating this issue! I've been thinking about this for a bit, specifically how to implement such functionality without rewriting everything from scratch and introducing very different behaviours and functionality. Since values are passed in a delimiter-based way (e g. Delimiter is
We'd have to look into how each value gets translated into each of the different languages. In a follow-up post, we should outline a cross comparison of different examples and how it gets parsed into each of the different languages so we can build a unit test out of it. Would that satisfy your needs @kverstae ? |
Thanks for taking a look at this @rcannood Your proposal would indeed satisfy my needs. I personally am not a big fan of the trailing delimiter. If I would just look at For strings, you could argue that when multiple is true and you get something like |
This comment was marked as outdated.
This comment was marked as outdated.
Just two questions to clarify:
A general remark regarding making breaking changes: I think its OK on the condition that we introduce a viash config value to override the 'not set' value (as suggested in #619 (comment)) because this would allow setting this config value to avoid making changes to the scripts. In |
Indeed!
With the latest proposal:
Come to think of it, being able to distinguish between Maybe I prefer |
I concur with the remark from @kverstae that the use of the separator is counterintuitive here. Am I overlooking something when I say that this could also work?
I'm not in favour of adding another keyword. I think we will be introducing breaking changes anyway, and introducing fewer breaking changes does not warrant the complexity overhead of adding another keyword. |
Yes, but by adding All kidding aside, we could take your quote idea and extend it
This allows users to pass the 'UNDEFINED' string by escaping it:
Furthermore, we should then also allow escaping separator, quotes and slashes:
However, what happens to characters which need to be escaped, when the string is not surrounded by quotes?
|
OHNOES! I came across two more edge cases Edge case: List of missing valueWe need to be able to distinguish between I think the only thing we can do is define another keyword Edge case: are optional arguments (with
|
This comment was marked as outdated.
This comment was marked as outdated.
I would personally prefer A question: when you pass Why would we use Am I right in thinking that Footnotes
|
From the perspective of the shell, I find you example counterintuitive because:
Also, a second issue arises from your poposed syntax: how to specify an empty list?
It does (see examples above). As an extra note, you can pass whatever you would like given the proper escaping
The reason why we need two values is because of an edge case Robrecht described above: we need to be able to distinguish between Given only one value
There is no real way to define |
Hence my suggestion:
For me, the |
I was at first strongly opposed to using the same name ( I don't really like I propose we borrowing from terminology used in other programming languages. For example:
Taking into account the popularity of each of these languages, I guess Happy to pick this up during a frodo to discuss! |
|
Final proposal: I'm going to update my previous post with the latest updates. Problem statementThere are multiple way of specifying arguments -- either from the CLI, in Groovy, or as a param_list csv / yaml. Right now, passing arguments directly in Groovy or in a YAML param_list offers the greatest flexibility in distinguishing between things:
This is due to the command-line interface having to pass values as strings (e.g. ExampleLet us take the following list of arguments: name: mycomp
arguments:
- type: integer
name: --optional_integer
default: 42
required: false
- type: string
name: --multiple_string
required: false
multiple: true
allow_UNDEFINED_ITEM: true # new field added (I could add all possible variants of required arguments, optional arguments, multiple arguments; for types string, integer, double, boolean, but it would be just more of the same). Desired parameter setsLet's say we want our channel to contain three events: [
[ "event0", [optional_integer: 42, multiple_string: ["a", null, "c"] ],
[ "event1", [optional_integer: 44, multiple_string: null] ],
[ "event2", [optional_integer: null, multiple_string: [] ],
[ "event3", [optional_integer: 42, multiple_string: [null]] ]
] Parameter ingestion variationsIn Nextflow / GroovyIn Nextflow / Groovy, this is already possible. We can just create a channel with exactly these parameters and presto. However, to ensure that the default value of Channel.fromList([
// optional_integer can be omitted, since it's the default
[ "event0", [multiple_string: ["a", null, "c"] ],
// multiple_string can be omitted, since the default is already what we want it to be
[ "event1", [optional_integer: 44] ],
// explicitly unset `optional_integer`, and explicitly set `multiple_string` to an empty list
[ "event2", [optional_integer: null, multiple_string: [] ],
// optional_integer can be omitted
[ "event3", [multiple_string: [null]] ]
])
| mycomp.run(
fromState: ["optional_integer", "multiple_string"],
toState: [...]
) As a yaml param listAll Viash workflows can also ingest a parameter list (csv, json, or yaml) using the
param_list:
- id: event0
multiple_string: ["a", null, "c"]
- id: event1
optional_integer: 44
- id: event2
optional_integer: null
multiple_string: []
- id: event3
multiple_string: [null] and then running the component with As a csv param listHowever, using a csv based parameter list causes issues here, because we can define the following csv:
This introduces different notations.
Note that other combinations of escaped characters (e.g. From the CLIWhen calling an executable from the CLI, we can apply the same parsing as with the CSV, except the user can also call the same argument multiple times. Note that the examples below only show Specifically, the following should be valid: # event 0
target/executable/mycomp/mycomp \
--multiple_string a --multiple_string UNDEFINED_ITEM --multiple_string c
# event 1
target/executable/mycomp/mycomp \
--optional_integer 44
# event 2
target/executable/mycomp/mycomp \
--optional_integer UNDEFINED \
--multiple_string ""
# event 3
target/executable/mycomp/mycomp \
--multiple_string UNDEFINED_ITEM However, the following should also be valid # event 0 alternative
target/executable/mycomp/mycomp \
--multiple_string "a;UNDEFINED_ITEM;c"
# event 0 yet another alternative
target/executable/mycomp/mycomp \
--multiple_string "a" --multiple_string "UNDEFINED_ITEM;c" And the latter should also work using the Nextflow CLI: # event 0
nextflow run . -main-script target/nextflow/mycomp/main.nf \
--multiple_string "a;UNDEFINED_ITEM;c" How it's interpreted in the scriptsR# event 0
par <- list(optional_integer = 42L, multiple_string = c("a", NA_character_, "c"))
# event 1
par <- list(optional_integer = 44L, multiple_string = NULL) # breaking change
# event 2
par <- list(optional_integer = NULL, multiple_string = character(0))
# event 3
par <- list(optional_integer = 42L, multiple_string = c(NA_character_)) Python# event 0
par = {"optional_integer": 42, "multiple_string": ["a", None, "c"]}
# event 1
par = {"optional_integer" : 44, "multiple_string": None} # breaking change
# event 2
par = {"optional_integer": None, "multiple_string": []} # breaking change
# event 3
par = {"optional_integer": 42, "multiple_string": [None]} JavaScript// event 0
const par = {"optional_integer": 42, "multiple_string": ["a", undefined, "c"]}
// event 1
const par = {"optional_integer" : 44, "multiple_string": undefined} // breaking change
// event 2
const par = {"optional_integer": undefined, "multiple_string": []} // breaking change
// event 3
const par = {"optional_integer": 42, "multiple_string": [undefined]} C#// event 0
var par = new { optional_integer = 42, multiple_string = new string[] { "a", null, "c" } };
// event 1
var par = new { optional_integer = 44, multiple_string = (string[])null };
// event 2
var par = new { optional_integer = (int?)null, multiple_string = new string[0] };
// event 3
var par = new { optional_integer = 42, multiple_string = new string[] { null } }; Scalacase class Par (
optional_integer: Some[Int],
multiple_string: Some[List[String]] // breaking change
)
// event 0
val par = Par(optional_integer = Some(42), multiple_string = Some(List("a", null, "c")))
// event 1
val par = Par(optional_integer = Some(44), multiple_string = None)
// event 2
val par = Par(optional_integer = None, multiple_string = Some(List()))
// event 3
val par = Par(optional_integer = Some(42), multiple_string = Some(List(null))) Bash# event0
par_optional_integer=42
par_multiple_string="a;UNDEFINED_ITEM;c"
# event1
par_optional_integer=44
# par_multiple_string is unset
# event2
# par_optional_integer is unset
par_multiple_string=""
# event3
par_optional_integer=42
par_multiple_string="UNDEFINED_ITEM" ConclusionThe following set of changes are proposed. Breaking changes
New functionality
|
Feature summary
Not sure if this is a bug or a feature, since I don't know what your expected behaviour is for this 😛
Passing
PARAMETER=1:2:null:4
to an argument that accepts multiple integer values should not fail parameter validationFeature description
Passing a
null
value as part of an argument that accepts multiple integer values, in the format with the multiple separator (e.g1:2:null:4
), the validation for the parameters currently fails with following error message:Error in module '<MODULE>' id '<ID>': input argument '<ARGUMENT>' has the wrong type. Expected type: List[Integer]. Found type: List[java.lang.String]
This causes issues if you have 2 parameters that both accept multiple values that rely on eachother. I will try to explain myself better with an example.
Example
Imagine following parameters:
If you have 3 features called A,B and C and you want to provide cutoff values for all of them, but for some features you would like to skip a certain cutoff (can be done by passing
null
), it should be possible to pass the arguments as such:... --features=A:B:C --cutoff_low=4:null:3 --cutoff_high=100:400:null
The order and amount of values for each argument is important, since they are linked to each other. Skipping the
null
values is therefor not possible.My example shows this in a CLI context, but my usecase is more in a Nextflow workflow. The principles remain the same tho...
Why is this feature beneficial?
I don't know if this feature would be widely usable, or if it is just for my usecase...
However, I still think that this should be valid since an optional argument of type integer (or any type for that matter) does accept
null
. Making that argument accept multiple (integer) values, shouldn't change that behaviour IMO.Alternatives considered
I could try to change my script to accept the parameters in a different format, but currently don't have any ideas to work around it. Suggestions are welcome 😄
Possible solution
The fix seems rather simple to me. If the value is
"null"
(null as string), convert to an actualnull
and run the normal parameter validation.I however am not sure if this could/would cause issues somewhere else...
Confirmation
The text was updated successfully, but these errors were encountered: