Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exp run --set-param fails to update param value #5777

Closed
scordee opened this issue Apr 6, 2021 · 5 comments
Closed

exp run --set-param fails to update param value #5777

scordee opened this issue Apr 6, 2021 · 5 comments
Labels
A: experiments Related to dvc exp bug Did we break something? p3-nice-to-have It should be done this or next sprint

Comments

@scordee
Copy link

scordee commented Apr 6, 2021

Bug Report

exp run --set-param params.py:TrainConfig.TEMP=0.0001 : does not update TrainConfig.TEMP=1e-8

Description

When trying to run an experiment --set-param does not update class variable TrainConfig.TEMP=1e-8 to TrainConfig=0.0001 declared and defined in params.py; however, --set-param does update TrainConfig.TEMP=0.0001 to TrainConfig:TEMP=1e-8.

Reproduce

Example:

  1. Consider the Data Pipelines tutorial example with all the stages already defined in the dvc.yaml file. The yaml file contains stages prepare, featurize, train and evaluate.
  2. Besides the params.yaml file also utilise the example params.py python file, provided here
  3. Add a variable TEMP=1e-8 to TrainConfig class in params.py file
# All standard variable types are supported.
BOOL = True
INT = 5
FLOAT = 0.001
STR = 'abc'
DICT = {'a': 1, 'b': 2}
LIST = [1, 2, 3]
SET = {4, 5, 6}
TUPLE = (10, 100)
NONE = None

# DVC can retrieve class constants and variables defined in __init__
class TrainConfig:

    EPOCHS = 70
    TEMP = 1e-8

    def __init__(self):
        self.layers = 5
        self.layers = 9  # TrainConfig.layers param will be 9
        self.sum = 1 + 2  # Will NOT be found due to the expression
        bar = 3  # Will NOT be found since it's locally scoped


class TestConfig:

    TEST_DIR = 'path'
    METRICS = ['metric']
  1. Add below parameter dependencies to your Data Pipelines dvc.yaml file, such that your train stage looks like below
train:
    cmd: python src/train.py data/features model.pkl
    deps:
    - data/features
    - src/train.py
    params:
    - train.min_split
    - train.n_est
    - train.seed
    - params.py:
      - BOOL
      - INT
      - FLOAT
      - TrainConfig.EPOCHS
      - TrainConfig.TEMP
      - TrainConfig.layers
    outs:
    - model.pkl
  1. dvc repro --force
  2. dvc exp run --set-param params.py:TrainConfig.TEMP=0.0001

Expected

A new experiment with param value TrainConfig.TEMP=0.0001

Environment information

Output of dvc doctor:

DVC version: 2.0.15 (pip)
---------------------------------
Platform: Python 3.9.1 on Linux-5.4.0-70-generic-x86_64-with-glibc2.31
Supports: azure, hdfs, http, https
Cache types: hardlink, symlink
Cache directory: ecryptfs on /home/<username>/.Private
Caches: local
Remotes: local
Workspace directory: ecryptfs on /home/<username>/.Private
Repo: dvc, git

Additional Information (if any):

@dberenbaum
Copy link
Collaborator

Could you please see the response in https://discord.com/channels/485586884165107732/485596304961962003/829087046173327423 and follow up? Thanks!

@scordee
Copy link
Author

scordee commented Apr 7, 2021

Thank you for the response.

To shed some more light on the issue, with initial TEMP value equal to 1e-8 dvc repro works fine, all the tutorial stages are reproduced using pyyaml as I only read in the params.yaml file into the tutorials' src/train.py not import params.py. In short, all I do is declare the TEMP value from params.py as a parameter dependency in dvc.yaml and do not use it anywhere. However, the question is whether the TEMP value is read in as float or str, the dvc.lock file does contain TEMP=1e-08. But as dvc uses ruamel.yaml, this is read in as float, I guess.

As suggested, I now set the TEMP value to 8.0e-5 in params.py and run dvc repro --force. The dvc.lock file does contain TEMP=8e-05. Then tried dvc exp run --set-param params.py:TrainConfig.TEMP=1.0e-4. The experiment does not update the TEMP value in dvc.lock or params.py file. Again, when I set TEMP value to 0.001 in params.py and run dvc repro --force. The dvc.lock file does contain TEMP=0.001and then when I try dvc exp run --set-param params.py:TrainConfig.TEMP=1.0e-7. The file dvc.lock and param.py is updated and a new experiment with updated TEMP parameter is performed.

@skshetry
Copy link
Member

skshetry commented Apr 7, 2021

@scordee, the support for the python params file is not great with DVC, it's very experimental and sadly not in a very good state. If you can, please use the yaml/toml/json files.


Regarding the bug, we are working internally with the parsed object. We get it as 1e-08 which we try to replace later, but actually, the text in the code is 1e-8 due to which it fails. Looking at the code, I'd really love to see AST based solution (parse into ast -> modify ast -> dump ast) than the current solution where we:

  1. Parse into AST
  2. Convert AST to a dictionary. Also add some stuffs of interests into it such as original source, lineno etc.
  3. Modify dict.
  4. Modify the source based on the new dictionary (by replacing old value with the new one).
  5. Dump the source (with ast.parse sanity check).

It fails on step 4, as the old value that we get is parsed to 1e-08 which we are trying to replace, instead of the actual 1e-8. The easier workaround might be to save the code segments as well in the step 2 and then use it later.

@skshetry skshetry added bug Did we break something? p2-medium Medium priority, should be done, but less important labels Apr 7, 2021
@dberenbaum
Copy link
Collaborator

@scordee Thanks for the detailed feedback to identify this bug! If you need a workaround for now, @skshetry already suggested that you could put the parameter into a yaml/toml/json file, or you could manually edit params.py instead of using --set-param.

@dmpetrov dmpetrov added the A: experiments Related to dvc exp label Apr 13, 2021
@scordee
Copy link
Author

scordee commented Apr 14, 2021

Thank you everyone for the quick reactions. Appreciate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: experiments Related to dvc exp bug Did we break something? p3-nice-to-have It should be done this or next sprint
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants