Specify New Parameters #16430

Closed
paantya opened this issue Jun 14, 2021 · 32 comments

@paantya

paantya commented Jun 14, 2021

Is there a plan to let users change a DAG's default startup parameters in the UI, so that the values can be used in Python code?

Something like what Prefect has:
https://docs.prefect.io/orchestration/tutorial/parameters.html#specify-new-parameters

Something like what DVC will have in the near future:
https://www.youtube.com/watch?v=nXJXR-zBvHQ&t=203s&ab_channel=DVCorg

@boring-cyborg

boring-cyborg bot commented Jun 14, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

@potiuk
Member

potiuk commented Jun 14, 2021

@potiuk potiuk closed this as completed Jun 14, 2021
@potiuk potiuk added the invalid label Jun 14, 2021
@potiuk
Member

potiuk commented Jun 14, 2021

For the future @paantya, please use GitHub discussions for similar questions, not issues.

@paantya
Author

paantya commented Jun 14, 2021

@potiuk As I understand it, this can only be used in an operator's templated fields, not in Python code that passes parameters to models.
Or did I get it wrong? Do you have an example of using this field to set parameters that are then used in Python code?
Please clarify.

@paantya
Author

paantya commented Jun 14, 2021

Do you have an example of passing these values as arguments to Python code?

@potiuk
Member

potiuk commented Jun 14, 2021

This triggers a DAG run. Those are parameters for the whole DAG run. You can use them in your DAG definition to pass them to any underlying operators.
The description, including examples, is here: https://airflow.apache.org/docs/apache-airflow/stable/dag-run.html#passing-parameters-when-triggering-dags
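
As a rough illustration of the point above, a Python callable might read trigger-time values like this. It is a minimal sketch, not taken from the Airflow docs: `read_conf` and `report_sleep` are hypothetical names, the `sleep` key is made up, and a stand-in object replaces the real `dag_run` so the snippet runs without Airflow installed. It assumes the callable is run by a `PythonOperator`, which passes `dag_run` to it from the task context.

```python
from types import SimpleNamespace
from typing import Any, Mapping, Optional


def read_conf(dag_run: Any, key: str, default: Any = None) -> Any:
    """Return a value from the trigger-time conf, or `default` if it is absent."""
    conf: Optional[Mapping[str, Any]] = getattr(dag_run, "conf", None)
    return (conf or {}).get(key, default)


def report_sleep(dag_run: Any = None, **_context: Any) -> int:
    """Hypothetical task callable: pick the sleep time for this run."""
    return int(read_conf(dag_run, "sleep", default=5))


# Stand-ins for the `dag_run` object Airflow would pass in the task context.
triggered_with_conf = SimpleNamespace(conf={"sleep": 30})
triggered_without_conf = SimpleNamespace(conf=None)

print(report_sleep(dag_run=triggered_with_conf))
print(report_sleep(dag_run=triggered_without_conf))
```

In a DAG file this would presumably be wired up with something like `PythonOperator(task_id="report_sleep", python_callable=report_sleep)`.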

@paantya
Author

paantya commented Jun 14, 2021

@potiuk
Then another question: can we configure this UI window to display the current default parameters with which the DAG will be launched?

@paantya
Author

paantya commented Jun 14, 2021

I have not found an example of how to do this the way Prefect or DVC does.

@paantya
Author

paantya commented Jun 14, 2021

I want to take out all the potentially changeable parameters in the UI so that we can change them if needed.

@potiuk
Member

potiuk commented Jun 14, 2021

The template for this config is not there, but it would be a nice feature to add. If you want to add a template, then by all means create a feature request (just make sure you follow the issue template and describe exactly what you want). Possibly you could submit a PR for it? Airflow is a community-driven project, so our users are welcome to make PRs.

@paantya
Author

paantya commented Jun 14, 2021

By "take out" I mean: make them editable as config params.

@paantya
Author

paantya commented Jun 14, 2021

I will try to arrange it

@paantya
Author

paantya commented Jun 14, 2021

@potiuk For a PR, do I need to create a new "Feature request" issue, or should I open one directly at https://github.com/apache/airflow/pulls?

@potiuk
Member

potiuk commented Jun 14, 2021

If you want to start on a PR, there is no need to create a feature request at all.

@msumit
Contributor

msumit commented Jun 15, 2021

@paantya you can use params instead of dag_run.conf, like it has been used here. The params dict will be available in the UI to edit before triggering a DAG. Then you can use {{ params['example_key'] }} in code to get that value.
Screenshot 2021-06-15 at 4 22 54 PM

@potiuk
Member

potiuk commented Jun 15, 2021

Ah nice! Thanks @msumit, TIL.

@potiuk
Member

potiuk commented Jun 15, 2021

@paantya maybe you would like to make a change to our documentation describing it, if it is not clear enough? That would be a nice first contribution to Airflow.

@paantya
Author

paantya commented Jun 15, 2021

:slowpoke:
I apologize, I misunderstood. Unfortunately, I still do not fully understand what is needed to display the default parameters.
The easiest option for me would be to display parameters from a config that I designate as the configuration (it could be a single file, or something similar to hydra-core configs), so that the values are picked up from it.

@paantya
Author

paantya commented Jun 15, 2021

@msumit
Looks interesting!
Please tell me, do you have an example of how to call or use this in Python code?

@paantya
Author

paantya commented Jun 15, 2021

@potiuk once I figure it out, I will gladly add it.

@paantya
Author

paantya commented Jun 15, 2021

@potiuk please tell me, can the parameters be displayed in .yaml format instead of .json?

@msumit
Contributor

msumit commented Jun 15, 2021

@paantya it should be straightforward to use.

Specify params when instantiating the DAG, for example:

with DAG(
    'tutorial',
    default_args=default_args,
    schedule_interval=timedelta(days=1),
    start_date=days_ago(2),
    params={"sleep": 5},
) as dag:

& then use it in a task, like this

    t2 = BashOperator(
        task_id='sleep',
        depends_on_past=False,
        bash_command="sleep {{ params['sleep'] }}",
    )
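
For plain Python code rather than a Bash template, a sketch along these lines may help. It assumes the task is a `PythonOperator`, which in Airflow 2 passes context entries such as `params` to the callable as keyword arguments. `sleep_from_params` is a hypothetical name, and the call at the bottom only simulates what Airflow would do, so the snippet runs without Airflow installed.

```python
import time
from typing import Any, Mapping


def sleep_from_params(params: Mapping[str, Any], **_context: Any) -> float:
    """Sleep for params['sleep'] seconds and return the duration used."""
    seconds = float(params["sleep"])
    time.sleep(seconds)
    return seconds


# Simulated call; Airflow would supply `params` and the rest of the context.
used = sleep_from_params(params={"sleep": 0.01}, ds="2021-06-15")
print(used)
```

Inside the `with DAG(...)` block this would presumably become `PythonOperator(task_id='sleep_py', python_callable=sleep_from_params)`. Whether values edited at trigger time show up in `params` may depend on the Airflow version and on the `dag_run_conf_overrides_params` setting.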

@paantya
Author

paantya commented Jun 15, 2021

Thanks, cool!

@msumit
What if I want to run Python code and expose all possible parameters here?
There are about 200 of them.

Will each of these parameters need to be passed as a command-line argument? Or can we somehow save them all to a file or pass them to Python?
Do you have any ideas?

@paantya
Author

paantya commented Jun 15, 2021

My small config .yaml:

project_name: "QL/SGOB/"

experiment:
  dataset: &dataset_name "data_test_5e5"
  name: &experiment_name "medium_size_clustering_estimation"

logger:
  -
    name: "Console"
  -
    name: "File"

data_preparation:
  key: "histogram"
  preprocess: "normalize_histogram_mean_std"

clustering:
  algo: "kmeans"
  tests: 10
  low: 15
  high: 200
  step: 10

estimation:
  metrics: ["calinski_harabasz", "minus_davies_bouldin"]
  file: true

@paantya
Author

paantya commented Jun 15, 2021

So far, the only approach that comes to mind is to specify each parameter in the launch command, like params['clustering']['high']:

params = {
  "clustering": {
    "high": 200, 
    "tests": 10, 
    "algo": "kmeans", 
    "step": 10, 
    "low": 15
  }, 
  "project_name": "QL/SGOB/", 
  "experiment": {
    "name": "medium_size_clustering_estimation", 
    "dataset": "data_test_5e5"
  }, 
  "logger": [
    {
      "name": "Console"
    }, 
    {
      "name": "File"
    }
  ], 
  "estimation": {
    "metrics": [
      "calinski_harabasz", 
      "minus_davies_bouldin"
    ], 
    "file": True
  }, 
  "data_preparation": {
    "preprocess": "normalize_histogram_mean_std", 
    "key": "histogram"
  }
}
with DAG(
    'tutorial',
    default_args=default_args,
    schedule_interval=timedelta(days=1),
    start_date=days_ago(2),
    params=params,
) as dag:

& then use it in a task, like this

t2 = BashOperator(
    task_id='run_test',
    depends_on_past=False,
    bash_command="python3.7 test.py params.clustering.high={{ params['clustering']['high'] }}",
)

@paantya
Author

paantya commented Jun 15, 2021

and do the same for all the other parameters...
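
Rather than writing some 200 template expressions by hand, the argument string could perhaps be generated from the default params dict. The sketch below uses hypothetical helpers `leaf_paths` and `template_args`, which are not part of Airflow. It assumes `test.py` accepts `params.a.b=value` arguments as in the command above, and it treats lists as single leaf values.

```python
from typing import Any, List, Mapping, Tuple


def leaf_paths(
    params: Mapping[str, Any], path: Tuple[str, ...] = ()
) -> List[Tuple[str, ...]]:
    """Return the key path of every non-mapping value in a nested mapping."""
    paths: List[Tuple[str, ...]] = []
    for key, value in params.items():
        here = path + (str(key),)
        if isinstance(value, Mapping):
            paths.extend(leaf_paths(value, here))
        else:
            paths.append(here)
    return paths


def template_args(params: Mapping[str, Any]) -> str:
    """Build one `params.a.b={{ params['a']['b'] }}` argument per leaf."""
    args = []
    for keys in leaf_paths(params):
        lookup = "".join(f"['{key}']" for key in keys)
        args.append(f"params.{'.'.join(keys)}={{{{ params{lookup} }}}}")
    return " ".join(args)


defaults = {"clustering": {"high": 200, "low": 15}, "project_name": "QL/SGOB/"}
bash_command = "python3.7 test.py " + template_args(defaults)
print(bash_command)
```

The resulting string still contains Jinja expressions, so the values would be filled in when the task is rendered, not when the DAG file is parsed. Values containing spaces or shell metacharacters would need extra quoting that this sketch does not attempt.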

@paantya
Author

paantya commented Jun 15, 2021

Can I somehow save these parameters to a file?
I would then add an extra task to the DAG that saves the parameters to a file, and at the very start of the code I would simply point to this temporary file with the launch parameters.
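
A minimal sketch of such a saving task, assuming it runs as a `PythonOperator` callable that receives `params` from the task context. `dump_params` is a hypothetical name, and JSON is used here only because it needs no extra dependency.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Mapping


def dump_params(params: Mapping[str, Any], **_context: Any) -> str:
    """Write the run's params to a temporary JSON file and return its path."""
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".json", prefix="dag_params_", delete=False
    ) as handle:
        json.dump(dict(params), handle, indent=2)
        return handle.name


# Simulated call; Airflow would supply `params` from the task context.
path = dump_params(params={"clustering": {"high": 200}})
print(Path(path).read_text())
```

Since a `PythonOperator` pushes its return value to XCom by default, a downstream task could presumably pick up the path with `{{ ti.xcom_pull(task_ids='dump_params') }}`. This only works if both tasks see the same filesystem, which is not guaranteed when tasks run on different workers.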

@paantya
Author

paantya commented Jun 15, 2021

Like this:

t2 = BashOperator(
    task_id='run_test',
    depends_on_past=False,
    bash_command="python3.7 test.py params.config_file={{ <pth_to_tmp_config_file> }}",
)

@msumit
Contributor

msumit commented Jun 15, 2021

@paantya ideally you should be able to read the file in your DAG, create a dict from it, and use that as the DAG's params. However, since the DAG data gets serialized and stored in the DB, I am not sure whether that breaks something.
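
One way to guard against the serialization concern might be to check, when the DAG file is parsed, that the loaded dict survives a JSON round trip. This is only a sketch under the assumption that JSON-compatibility is what matters for DAG serialization. `check_json_safe` is a hypothetical helper, not an Airflow function.

```python
import datetime
import json
from typing import Any, Dict


def check_json_safe(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return params unchanged if a JSON round trip preserves them, else raise."""
    try:
        round_tripped = json.loads(json.dumps(params))
    except TypeError as exc:
        raise ValueError(f"params are not JSON-serializable: {exc}") from exc
    if round_tripped != params:
        raise ValueError("params change when round-tripped through JSON")
    return params


# Plain dicts, lists, strings, numbers and booleans pass unchanged.
ok = check_json_safe({"clustering": {"high": 200}, "metrics": ["calinski_harabasz"]})
print(ok)

# A date value, which a YAML loader can produce for `2021-06-15`, is rejected.
try:
    check_json_safe({"day": datetime.date(2021, 6, 15)})
except ValueError as exc:
    print(exc)
```

The checked dict could then be passed as `params=check_json_safe(loaded)` when creating the DAG.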

@paantya
Author

paantya commented Jun 15, 2021

Thanks.

Is there a way to pass DAG parameters without explicitly specifying them in the command string?
Or am I misunderstanding something?

@paantya
Author

paantya commented Jun 16, 2021

Is it possible somehow to save the values passed to params to disk?

I can imagine that we can prepare a dictionary for params using a function (not tested yet).

def get_params():
    import yaml
    with open("params.yaml", 'r') as stream:
        try:
            return yaml.safe_load(stream)
        except yaml.YAMLError as exc:
            print(exc)

with DAG(
    'tutorial',
    default_args=default_args,
    schedule_interval=timedelta(days=1),
    start_date=days_ago(2),
    params=get_params(),
) as dag:

@paantya
Author

paantya commented Aug 27, 2021

@msumit Can you please tell me whether the params can be displayed in the UI as YAML rather than JSON?
