Could not parse metadata file from S3 #3388

Jeffwan · 2020-03-29T22:50:09Z

What steps did you take:

I configured Argo, the KFP UI, and the KFP API server to use S3 as the artifact store.

What happened:

Could not parse metadata file at: artifacts/iris-classification-pipeline-4ktlj/iris-classification-pipeline-4ktlj-3034405897/mlpipeline-ui-metadata.tgz. Error: SyntaxError: Unexpected token � in JSON at position 0

Oddly, this works fine when the artifact is stored in Minio but not when it is stored in S3. As the screenshot shows, the output contains unreadable characters. I verified that the bucket and key path are correct, and I can fetch the file from the command line.

I am not sure whether the way we persist the file matters. I downloaded both files from Minio and S3. Both the tgz archives and the extracted JSON files report the same file type.

$ file ~/Downloads/mlpipeline-ui-metadata.tgz
~/Downloads/mlpipeline-ui-metadata.tgz: gzip compressed data, last modified: Sun Mar 29 06:24:22 2020, from Unix

$ file ~/Downloads/mlpipeline-ui-metadata.json
~/Downloads/mlpipeline-ui-metadata.json: ASCII text, with very long lines, with no line terminators

The only difference I know of is that in S3 the file metadata has Content-Type: application/x-gtar-compressed. I downloaded the Minio file and re-uploaded it to S3, which gave it Content-Type: application/gzip, but this still doesn't work.
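
The error message shows that the bytes handed to the JSON parser were not plain JSON text. A minimal sketch for checking a downloaded artifact locally is given here; it assumes the archive holds a single regular file with the JSON payload, and the helper name `load_ui_metadata` is hypothetical, not part of KFP. It only tells you whether the stored object itself is parseable, not what the UI does with it.

```python
import io
import json
import tarfile

# The first two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def load_ui_metadata(raw: bytes) -> dict:
    """Parse UI metadata from raw bytes that are either plain JSON or a .tgz."""
    if raw[:2] != GZIP_MAGIC:
        # Not gzip-compressed: treat the bytes as plain JSON text.
        return json.loads(raw.decode("utf-8"))
    # Gzip-compressed: open as a tar archive and parse the first regular file.
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as archive:
        for member in archive:
            if member.isfile():
                extracted = archive.extractfile(member)
                return json.loads(extracted.read().decode("utf-8"))
    raise ValueError("archive contains no regular file")
```

Reading the downloaded `mlpipeline-ui-metadata.tgz` in binary mode and passing its bytes to this function should return the metadata dict if the object in the bucket is intact.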

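Python code I used to persist the artifact (the metadata dict is shown below):

import json
from tensorflow.python.lib.io import file_io

# Write the UI metadata where the pipeline step exposes it as an output artifact.
with file_io.FileIO('/tmp/mlpipeline-ui-metadata.json', 'w') as f:
    json.dump(metadata, f)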

{
  "outputs": [
    {
      "storage": "inline",
      "source": "# Inline Markdown\n[A link](https://www.kubeflow.org/)",
      "type": "markdown"
    },
    {
      "source": "https://raw.githubusercontent.com/kubeflow/pipelines/master/README.md",
      "type": "markdown"
    },
    {
      "type": "confusion_matrix",
      "format": "csv",
      "schema": [
        {
          "name": "target",
          "type": "CATEGORY"
        },
        {
          "name": "predicted",
          "type": "CATEGORY"
        },
        {
          "name": "count",
          "type": "NUMBER"
        }
      ],
      "source": "s3://jiaixn-kubeflow-pipeline-data/iris-example/confusion_matrix.csv",
      "labels": [
        "0",
        "1"
      ]
    },
    {
      "type": "tensorboard",
      "source": "s3://jiaixn-kubeflow-pipeline-data/iris-example/tb-logs"
    }
  ]
}
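
For comparison, the same file can be written without TensorFlow. This is a minimal sketch using only the standard library; the helper name `write_ui_metadata` is hypothetical, and the metadata is trimmed to the first output from the JSON above.

```python
import json


def write_ui_metadata(metadata: dict,
                      path: str = "/tmp/mlpipeline-ui-metadata.json") -> None:
    """Serialize the UI metadata dict to a local JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f)


# Trimmed example: only the inline markdown output from the issue.
metadata = {
    "outputs": [
        {
            "storage": "inline",
            "source": "# Inline Markdown\n[A link](https://www.kubeflow.org/)",
            "type": "markdown",
        }
    ]
}
```

Calling `write_ui_metadata(metadata)` produces a single-line JSON file, comparable to what `file` reports above for the downloaded copy.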

What did you expect to happen:

Environment:

How did you deploy Kubeflow Pipelines (KFP)?
Standalone

KFP version: 0.2.5

/kind bug
/area frontend
/area backend

Jeffwan · 2020-03-30T00:26:57Z

#2992 fixes this issue, so I will try the latest version.

k8s-ci-robot added kind/bug area/frontend area/backend labels Mar 29, 2020

Jeffwan mentioned this issue Mar 31, 2020

S3 support in Kubeflow Pipelines #3405

Closed

Jeffwan closed this as completed Apr 7, 2020
