Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boolean value might get unnecessarily uppercased #4996

Closed
johnnychen94 opened this issue Nov 30, 2020 · 8 comments · Fixed by #5004
Closed

Boolean value might get unnecessarily uppercased #4996

johnnychen94 opened this issue Nov 30, 2020 · 8 comments · Fixed by #5004
Assignees
Labels
A: templating Related to the templating feature bug Did we break something?

Comments

@johnnychen94
Copy link
Contributor

johnnychen94 commented Nov 30, 2020

Bug Report

Python uses True/False as boolean values, while in other languages it might not be the case (Julia uses true/false).

It seems that the correct behavior is dvc faithfully stores the params as strings, and only does type conversion (str -> bool/int/float) when it needs to output some statistics, e.g., dvc params diff.

This seems to be an edge case for bool values only, so feel free to close if you don't plan to fix it.

Please provide information about your setup

Output of dvc version:

DVC version: 1.10.2 (pip)
---------------------------------
Platform: Python 3.8.3 on Linux-5.4.0-54-generic-x86_64-with-glibc2.10
Supports: All remotes
Cache types: symlink
Cache directory: nfs4 on storage:/home
Caches: local
Remotes: s3
Workspace directory: nfs4 on storage:/home
Repo: dvc, git

Additional Information (if any):

# dvc.yaml
stages:
  build:
    cmd: julia -e "println(ARGS); println(typeof.(ARGS))" ${opt1} ${opt2} ${opt3}
# params.yaml
opt1: true
opt2: "true"
opt3: some_other_string

Notice that opt1 is transformed to "True" instead of "true".

jc@mathai3:~/Downloads/dvc_test$ dvc repro -f
Running stage 'build' with command:
	julia -e "println(ARGS); println(typeof.(ARGS))" True true some_other_string

["True", "true", "some_other_string"]
DataType[String, String, String]
@efiop efiop added the A: templating Related to the templating feature label Nov 30, 2020
@karajan1001
Copy link
Contributor

Here , the boolean value true and false should follow neither python nor julia but the yaml format or the json format, it is a bug I think.

@karajan1001 karajan1001 added the bug Did we break something? label Dec 1, 2020
@skshetry
Copy link
Member

skshetry commented Dec 1, 2020

The thing here is what boolean means on the shell context. So, the use of boolean is kind of undefined behaviour, though there's nothing wrong in making it return false or true, i.e. lowercased.

@skshetry skshetry self-assigned this Dec 1, 2020
@karajan1001
Copy link
Contributor

@skshetry
Shouldn't we deserialize these arguments through YAML, instead of passing them through the shell?

@skshetry
Copy link
Member

skshetry commented Dec 1, 2020

@karajan1001, they are stringified. The problem is that when YAML is loaded, it gets loaded as Python's boolean True/False which on str() becomes True/False.


in reality. the parser actually just generates a resolved dvc.yaml-like data, and substitutes.
So, if you add a cmd with just ${opt1}, the substitution happens as a boolean value.
If it has other string appended to it (eg: --value ${opt1}, then it is read as a string (with aforementioned logic of boolean stringification).

This thing has been limited by the schema that we use. It only allows validation, whereas we also need to convert those values.

https://github.com/iterative/dvc/blob/master/dvc/schema.py

@skshetry
Copy link
Member

skshetry commented Dec 1, 2020

I'll solve this issue by just converting boolean stringification to true/false as an edge-case.

@karajan1001
Copy link
Contributor

karajan1001 commented Dec 1, 2020

Why not use the dump method in YAML and JSON?
image

@skshetry
Copy link
Member

skshetry commented Dec 1, 2020

It's many times slower than the following for simple conversion:

def conv(boolean):
  return "true" if boolean else "false"

But, your point is valid, in that using json.dumps might allow us to provide better compatibility even on edge cases.

Also, we don't support the dict/list in cmd, so not much of a need to use those yet.

skshetry added a commit to skshetry/dvc that referenced this issue Dec 1, 2020
Previously, Python's `str()` was being used that
resulted in boolean transformed into "True"/"False".

Fixes iterative#4996
@skshetry
Copy link
Member

skshetry commented Dec 1, 2020

@johnnychen94, thanks for trying out parametrization. As you seem to use parameterization quite a lot, it'd be great if you could also provide some feedback overall (not just on the bugs but the feature 😜). How's been the experience?

You can either respond here or on the original issue #3633. We are hoping to release this feature at the end of the next month with a pre-release scheduled this week. So, any feedback is appreciated and will help us iron out the kinks.

Thanks again for the bug report. 🙏

skshetry added a commit that referenced this issue Dec 1, 2020
Previously, Python's `str()` was being used that
resulted in boolean transformed into "True"/"False".

Fixes #4996
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: templating Related to the templating feature bug Did we break something?
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants