
Compiler compiles invalid yaml when text is too long to be in one line #2495

Closed
l1990790120 opened this issue Oct 25, 2019 · 11 comments · Fixed by #3520

Comments

@l1990790120
Contributor

What happened:

When running dsl-compile ... on a simple pipeline definition, it generates an invalid pipeline YAML file like the following:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    pipelines.kubeflow.org/pipeline_spec: '{"description": "Example Experiment Pipeline",
      "inputs": [{"name": "model_id"}, {"name": "train_data_path"}, {"name": "validate_data_path"},
      {"name": "train_kwargs"}], "name": "Example Experiment Pipeline"}'
  generateName: experiment-pipeline-
spec:
...

When using the API to run the pipeline, it returns the following error:

{"error":"Failed to create a new run.: Failed to fetch workflow spec.: Get pipeline YAML failed.: InternalServerError: Failed to unmarshal pipelines/20ee9d10-bb43-4c4f-b996-adcd2a29b0cd: error unmarshaling JSON: while decoding JSON: json: cannot unmarshal bool into Go struct field Container.args of type string: error unmarshaling JSON: while decoding JSON: json: cannot unmarshal bool into Go struct field Container.args of type string","message":"Failed to create a new run.: Failed to fetch workflow spec.: Get pipeline YAML failed.:  ...

In short, the compiled YAML file is rejected as invalid by the backend.

What did you expect to happen:

The correct form, where the long quoted text is broken across multiple lines, should be the following:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    pipelines.kubeflow.org/pipeline_spec: >-
      {"description": "Example Experiment Pipeline", "inputs": [{"name":
      "train_data_path"}, {"name": "validate_data_path"}, {"default":
      "{\"params\":{\"eval_metric\": \"auc\",\"seed\": 0,\"num_round\":
      100,\"alpha\": 4,\"tree_method\": \"hist\",\"booster\":
      \"gbtree\",\"min_child_weight\": 10,\"eta\": 0.2,\"objective\":
      \"binary:logistic\",\"max_depth\": 10,\"gamma\": 3.0}}", "name":
      "train_kwargs"}], "name": "Example Experiment Pipeline"}
  generateName: experiment-pipeline-
spec:

Here the long annotation value starts with the >- folded block scalar indicator, followed by the text, instead of being wrapped across lines inside quotes.
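For reference, a compliant YAML parser should decode both scalar styles to the same string. A quick sketch with PyYAML (assuming PyYAML is available) showing that a multi-line single-quoted scalar and a >- folded block scalar load identically:

```python
import yaml

# A long value written as a single-quoted scalar folded across two lines;
# the line break inside the quotes folds to a single space.
quoted = "key: 'first part\n  second part'\n"

# The same value written as a folded block scalar; >- joins the lines with
# a space and strips the trailing newline.
folded = "key: >-\n  first part\n  second part\n"

assert yaml.safe_load(quoted)["key"] == "first part second part"
assert yaml.safe_load(quoted) == yaml.safe_load(folded)
```

So if the backend treats the two forms differently, the bug is on the parsing side rather than in the annotation content itself.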

What steps did you take:

This problem does not occur when calling .run_pipeline, since that path translates the YAML into JSON and runs the pipeline with a JSON body.

Anything else you would like to add:

@Ark-kun Ark-kun self-assigned this Oct 25, 2019
@Ark-kun
Contributor

Ark-kun commented Oct 26, 2019

I'm not sure the problem is in the annotation.
Can you try removing the annotation and running again?

Let me know the result.

@l1990790120
Contributor Author

@Ark-kun If I manually remove or edit those lines, I can trigger the pipeline through the API successfully. The main point, though, is that I shouldn't have to edit the output by hand: ideally, when a line is too long, dsl-compile should use ">-" instead of breaking the value up with quotes.

@Ark-kun
Contributor

Ark-kun commented Oct 30, 2019

Can you elaborate on why you think the different styles of YAML string encodings are not equivalent?

@Ark-kun
Contributor

Ark-kun commented Oct 31, 2019

Can you please post the whole problematic YAML?
I'm interested in this error: "cannot unmarshal bool into Go struct field Container.args of type string"

@l1990790120
Contributor Author

l1990790120 commented Nov 4, 2019

@Ark-kun

compiling the following .py file

import os

import kfp
from kfp import dsl
import kfp.aws as aws
from kubernetes import client as k8sc

mlflow_experiment_op = kfp.components.load_component_from_file(
    os.path.join(
        os.getcwd(),
        "mlflow-experiments",
        "kubeflow",
        "components",
        "xgboost",
        "component.yaml",
    )
)


def compare_mlflow_model_op(
    model_ids, metrics=["roc_auc_score", "log_loss"], to_kubeflow="y"
):
    return dsl.ContainerOp(
        name="Compare MLFlow Models",
        image="xxx.xxx/mlflow-to-kubeflow",
        arguments=[
            "--models",
            model_ids,
            "--metrics",
            metrics,
            "--to_kubeflow",
            to_kubeflow,
        ],
        file_outputs={
            "mlpipeline-metrics": "/mlpipeline-metrics.json",
            "mlpipeline-ui-metadata": "/mlpipeline-ui-metadata.json",
        },
    ).add_env_variable(
        k8sc.V1EnvVar(
            name="MLFLOW_URL",
            value="https://xxx.xxx",
        )
    )


default_train_kwargs = """{"params":{"eval_metric": "auc","seed": 0,"num_round": 100,"alpha": 4,"tree_method": "hist","booster": "gbtree","min_child_weight": 10,"eta": 0.2,"objective": "binary:logistic","max_depth": 10,"gamma": 3.0}}"""


@kfp.dsl.pipeline(
    name="MLflow Experiment Pipeline", description="MLflow Experiment Pipeline"
)
def run_mlflow_experiment(
    model_id: str,
    train_data_path: str,
    validate_data_path: str,
    log_metrics: str,
    train_kwargs: str = default_train_kwargs,
):
    mlflow_experiment = mlflow_experiment_op(
        train_data_path=train_data_path,
        validate_data_path=validate_data_path,
        model_id=model_id,
        train_kwargs=train_kwargs,
        log_metrics=log_metrics,
    ).apply(
        aws.use_aws_secret(
            secret_name="aws-secrets",
            aws_access_key_id_name="AWS_ACCESS_KEY_ID",
            aws_secret_access_key_name="AWS_SECRET_ACCESS_KEY",
        )
    )
    mlflow_experiment.set_memory_request("8G")
    mlflow_experiment.set_cpu_request("4")

    compare_mlflow_model = compare_mlflow_model_op(model_ids=model_id)
    compare_mlflow_model.after(mlflow_experiment)

it compiles into the following. If you upload this as a new pipeline, it throws the above error when you try to run it:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    pipelines.kubeflow.org/pipeline_spec: '{"description": "MLflow Experiment Pipeline",
      "inputs": [{"name": "model_id"}, {"name": "train_data_path"}, {"name": "validate_data_path"},
      {"name": "log_metrics"}, {"default": "{\"params\":{\"eval_metric\": \"auc\",\"seed\":
      0,\"num_round\": 100,\"alpha\": 4,\"tree_method\": \"hist\",\"booster\": \"gbtree\",\"min_child_weight\":
      10,\"eta\": 0.2,\"objective\": \"binary:logistic\",\"max_depth\": 10,\"gamma\":
      3.0}}", "name": "train_kwargs"}], "name": "MLflow Experiment Pipeline"}'
  generateName: mlflow-experiment-pipeline-
spec:
  arguments:
    parameters:
    - name: model-id
    - name: train-data-path
    - name: validate-data-path
    - name: log-metrics
    - name: train-kwargs
      value: '{"params":{"eval_metric": "auc","seed": 0,"num_round": 100,"alpha":
        4,"tree_method": "hist","booster": "gbtree","min_child_weight": 10,"eta":
        0.2,"objective": "binary:logistic","max_depth": 10,"gamma": 3.0}}'
  entrypoint: mlflow-experiment-pipeline
  serviceAccountName: pipeline-runner
  templates:
  - container:
      args:
      - --models
      - '{{inputs.parameters.model-id}}'
      - --metrics
      - '[''roc_auc_score'', ''log_loss'']'
      - --to_kubeflow
      - y
      env:
      - name: MLFLOW_URL
        value: https://xxx.xxx
      image: xxx.xxx/mlflow-to-kubeflow
    inputs:
      parameters:
      - name: model-id
    name: compare-mlflow-models
    outputs:
      artifacts:
      - name: mlpipeline-metrics
        path: /mlpipeline-metrics.json
      - name: mlpipeline-ui-metadata
        path: /mlpipeline-ui-metadata.json
  - dag:
      tasks:
      - arguments:
          parameters:
          - name: model-id
            value: '{{inputs.parameters.model-id}}'
        dependencies:
        - xgboost
        name: compare-mlflow-models
        template: compare-mlflow-models
      - arguments:
          parameters:
          - name: log-metrics
            value: '{{inputs.parameters.log-metrics}}'
          - name: model-id
            value: '{{inputs.parameters.model-id}}'
          - name: train-data-path
            value: '{{inputs.parameters.train-data-path}}'
          - name: train-kwargs
            value: '{{inputs.parameters.train-kwargs}}'
          - name: validate-data-path
            value: '{{inputs.parameters.validate-data-path}}'
        name: xgboost
        template: xgboost
    inputs:
      parameters:
      - name: log-metrics
      - name: model-id
      - name: train-data-path
      - name: train-kwargs
      - name: validate-data-path
    name: mlflow-experiment-pipeline
  - container:
      args:
      - --model_id
      - '{{inputs.parameters.model-id}}'
      - --train_data_path
      - '{{inputs.parameters.train-data-path}}'
      - --validate_data_path
      - '{{inputs.parameters.validate-data-path}}'
      - --build_xgboost_with
      - skip
      - --train_kwargs
      - '{{inputs.parameters.train-kwargs}}'
      - --log-metrics
      - '{{inputs.parameters.log-metrics}}'
      command:
      - python
      - -m
      - utils.kubeflow_run
      env:
      - name: MLFLOW_TRACKING_URI
        value: https://xxx.xxx
      - name: AWS_ACCESS_KEY_ID
        valueFrom:
          secretKeyRef:
            key: AWS_ACCESS_KEY_ID
            name: aws-secrets
      - name: AWS_SECRET_ACCESS_KEY
        valueFrom:
          secretKeyRef:
            key: AWS_SECRET_ACCESS_KEY
            name: aws-secrets
      image: xxx.xxx/mlflow/xgboost
      resources:
        requests:
          cpu: '4'
          memory: 8G
    inputs:
      parameters:
      - name: log-metrics
      - name: model-id
      - name: train-data-path
      - name: train-kwargs
      - name: validate-data-path
    metadata:
      annotations:
        pipelines.kubeflow.org/component_spec: '{"description": "Train xgboost model
          with xgboost package", "inputs": [{"description": "training data s3 path,
          note: must be in libsvm format", "name": "train_data_path", "type": "String"},
          {"description": "validating data s3 path, note: must be in libsvm format",
          "name": "validate_data_path", "type": "String"}, {"name": "model_id", "type":
          "String"}, {"name": "train_kwargs", "type": "String"}, {"name": "log_metrics",
          "type": "String"}], "name": "xgboost"}'
    name: xgboost

If I use kfp's client.run_pipeline, it produces the correct YAML. I've gone through the source code: in run_pipeline, the client dumps the YAML into JSON and submits it as the request body, and perhaps that is why the REST backend (the Go side) is able to parse it correctly.
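The YAML-to-JSON step described above can be sketched roughly like this (a simplified illustration of the idea, not the actual kfp client code; the function name is hypothetical):

```python
import json
import yaml

def workflow_yaml_to_json_body(yaml_text: str) -> str:
    # Load the compiled workflow YAML into plain Python objects, then
    # re-serialize as JSON. JSON strings have a single representation, so
    # any ambiguity between YAML scalar styles disappears before the
    # backend ever sees the spec.
    spec = yaml.safe_load(yaml_text)
    return json.dumps(spec)

body = workflow_yaml_to_json_body("metadata:\n  generateName: experiment-pipeline-\n")
print(body)  # {"metadata": {"generateName": "experiment-pipeline-"}}
```

This would explain why the same pipeline runs fine through run_pipeline but fails when the raw YAML is uploaded and parsed by the Go backend.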

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    pipelines.kubeflow.org/pipeline_spec: >-
      {"description": "MLflow Experiment Pipeline", "inputs": [{"name":
      "model_id"}, {"name": "train_data_path"}, {"name": "validate_data_path"},
      {"name": "log_metrics"}, {"default": "{\"params\":{\"eval_metric\":
      \"auc\",\"seed\": 0,\"num_round\": 100,\"alpha\": 4,\"tree_method\":
      \"hist\",\"booster\": \"gbtree\",\"min_child_weight\": 10,\"eta\":
      0.2,\"objective\": \"binary:logistic\",\"max_depth\": 10,\"gamma\":
      3.0}}", "name": "train_kwargs"}], "name": "MLflow Experiment Pipeline"}
  generateName: mlflow-experiment-pipeline-
spec:
  arguments:
    parameters:
      - name: model-id
      - name: train-data-path
      - name: validate-data-path
      - name: log-metrics
      - name: train-kwargs
        value: >-
          {"params":{"eval_metric": "auc","seed": 0,"num_round": 100,"alpha":
          4,"tree_method": "hist","booster": "gbtree","min_child_weight":
          10,"eta": 0.2,"objective": "binary:logistic","max_depth": 10,"gamma":
          3.0}}
  entrypoint: mlflow-experiment-pipeline
  serviceAccountName: pipeline-runner
  templates:
    - container:
        args:
          - '--models'
          - '{{inputs.parameters.model-id}}'
          - '--metrics'
          - '[''roc_auc_score'', ''log_loss'']'
          - '--to_kubeflow'
          - 'y'
        env:
          - name: MLFLOW_URL
            value: 'https://xxx.xxx'
        image: xxx.xxx/mlflow-to-kubeflow
      inputs:
        parameters:
          - name: model-id
      name: compare-mlflow-models
      outputs:
        artifacts:
          - name: mlpipeline-metrics
            path: /mlpipeline-metrics.json
          - name: mlpipeline-ui-metadata
            path: /mlpipeline-ui-metadata.json
    - dag:
        tasks:
          - arguments:
              parameters:
                - name: model-id
                  value: '{{inputs.parameters.model-id}}'
            dependencies:
              - xgboost
            name: compare-mlflow-models
            template: compare-mlflow-models
          - arguments:
              parameters:
                - name: log-metrics
                  value: '{{inputs.parameters.log-metrics}}'
                - name: model-id
                  value: '{{inputs.parameters.model-id}}'
                - name: train-data-path
                  value: '{{inputs.parameters.train-data-path}}'
                - name: train-kwargs
                  value: '{{inputs.parameters.train-kwargs}}'
                - name: validate-data-path
                  value: '{{inputs.parameters.validate-data-path}}'
            name: xgboost
            template: xgboost
      inputs:
        parameters:
          - name: log-metrics
          - name: model-id
          - name: train-data-path
          - name: train-kwargs
          - name: validate-data-path
      name: mlflow-experiment-pipeline
    - container:
        args:
          - '--model_id'
          - '{{inputs.parameters.model-id}}'
          - '--train_data_path'
          - '{{inputs.parameters.train-data-path}}'
          - '--validate_data_path'
          - '{{inputs.parameters.validate-data-path}}'
          - '--build_xgboost_with'
          - skip
          - '--train_kwargs'
          - '{{inputs.parameters.train-kwargs}}'
          - '--log-metrics'
          - '{{inputs.parameters.log-metrics}}'
        command:
          - python
          - '-m'
          - utils.kubeflow_run
        env:
          - name: MLFLOW_TRACKING_URI
            value: 'https://xxx.xxx'
          - name: AWS_ACCESS_KEY_ID
            valueFrom:
              secretKeyRef:
                key: AWS_ACCESS_KEY_ID
                name: aws-secrets
          - name: AWS_SECRET_ACCESS_KEY
            valueFrom:
              secretKeyRef:
                key: AWS_SECRET_ACCESS_KEY
                name: aws-secrets
        image: xxx.xxx/mlflow/xgboost
        resources:
          requests:
            cpu: '4'
            memory: 8G
      inputs:
        parameters:
          - name: log-metrics
          - name: model-id
          - name: train-data-path
          - name: train-kwargs
          - name: validate-data-path
      metadata:
        annotations:
          pipelines.kubeflow.org/component_spec: >-
            {"description": "Train xgboost model with xgboost package",
            "inputs": [{"description": "training data s3 path, note: must be in
            libsvm format", "name": "train_data_path", "type": "String"},
            {"description": "validating data s3 path, note: must be in libsvm
            format", "name": "validate_data_path", "type": "String"}, {"name":
            "model_id", "type": "String"}, {"name": "train_kwargs", "type":
            "String"}, {"name": "log_metrics", "type": "String"}], "name":
            "xgboost"}
      name: xgboost

@Ark-kun
Contributor

Ark-kun commented Nov 5, 2019

@IronPan Is it possible that the Go YAML parser is parsing YAML strings incorrectly? There should be no difference between YAML's four scalar styles. Can you please check how the two posted pipelines are parsed?
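One plain-scalar suspect in the first compiled YAML above is the unquoted `- y` argument: under YAML 1.1 rules, bare words like yes/no resolve to booleans rather than strings (and some parsers, reportedly including the Go yaml.v2 library, extend this to bare y/n). A small PyYAML demonstration of the general pitfall; PyYAML resolves yes/no but not bare y/n, so yes is used here:

```python
import yaml

# Unquoted, a YAML 1.1-style parser resolves "yes" to the boolean True...
unquoted = yaml.safe_load("args: [--to_kubeflow, yes]")
assert unquoted["args"][1] is True

# ...while quoting keeps it a string, which is what Argo's Container.args
# (a list of strings on the Go side) requires.
quoted = yaml.safe_load("args: [--to_kubeflow, 'yes']")
assert quoted["args"][1] == "yes"
```

That would produce exactly a "cannot unmarshal bool into Go struct field Container.args of type string" error when the Go side decodes the spec.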

@l1990790120
Contributor Author

l1990790120 commented Nov 11, 2019

Hello - @Ark-kun and @IronPan

I am not sure if you have had a chance to take a look. I did a quick PR for this fix and raised a question on the PR as well. Let me know your thoughts.

PR: #2591

@RunOrVeith

For anyone stumbling over this in the future (like me):

This also happens if you create an environment variable with a non-string value, e.g.:

from kfp import dsl
from kubernetes.client.models import V1EnvVar

op: dsl.ContainerOp = ...  # some container op

# This will compile just fine, but when starting a run the following error will be raised:
op.add_env_variable(V1EnvVar(name="SOMETHING", value=1))

# Compile a pipeline with this op, upload it and try to run it...
{"error":"Failed to create a new run.: Failed to fetch workflow spec.: Get pipeline YAML failed.: InternalServerError: Failed to unmarshal pipelines/13a0e275-e6ee-46ce-a2c5-3589f0a4c2bd: error unmarshaling JSON: while decoding JSON: json: cannot unmarshal number into Go struct field EnvVar.value of type string: error unmarshaling JSON: while decoding JSON: json: cannot unmarshal number into Go struct field EnvVar.value of type string","message":"Failed to create a new run.: Failed to fetch workflow spec.: Get pipeline YAML failed.: InternalServerError: Failed to unmarshal pipelines/13a0e275-e6ee-46ce-a2c5-3589f0a4c2bd: error unmarshaling JSON: while decoding JSON: json: cannot unmarshal number into Go struct field EnvVar.value of type string: error unmarshaling JSON: while decoding JSON: json: cannot unmarshal number into Go struct field EnvVar.value of type string","code":13,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Internal Server Error","error_details":"Failed to create a new run.: Failed to fetch workflow spec.: Get pipeline YAML failed.: InternalServerError: Failed to unmarshal pipelines/13a0e275-e6ee-46ce-a2c5-3589f0a4c2bd: error unmarshaling JSON: while decoding JSON: json: cannot unmarshal number into Go struct field EnvVar.value of type string: error unmarshaling JSON: while decoding JSON: json: cannot unmarshal number into Go struct field EnvVar.value of type string"}]}
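A simple client-side guard is to coerce env var values to strings before constructing the V1EnvVar, since the Go-side EnvVar.value field only accepts strings. A hypothetical helper (not part of kfp) sketching the idea:

```python
def coerce_env_value(value) -> str:
    # Kubernetes/Argo expect EnvVar.value to be a string. Booleans are
    # lowercased to match the usual true/false spelling containers read.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)

assert coerce_env_value(1) == "1"
assert coerce_env_value(True) == "true"

# Then, instead of V1EnvVar(name="SOMETHING", value=1):
# op.add_env_variable(V1EnvVar(name="SOMETHING", value=coerce_env_value(1)))
```

With this, the pipeline compiles to a spec the backend can unmarshal, rather than failing only at run time.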

@l1990790120
Contributor Author

l1990790120 commented Mar 11, 2020

@RunOrVeith I feel I have seen this error (type conflicts) from time to time, but it's not really related to this issue. If you can provide an example in a separate issue, I am happy to help look into it.

@RunOrVeith

@l1990790120 I opened a new issue describing the problem in #3286

@Ark-kun
Contributor

Ark-kun commented Apr 16, 2020

@l1990790120
I have found the root cause.
Please read the writeup at #3519
