[release] 1.7.0 tracker #5779

Closed
22 of 23 tasks
Bobgy opened this issue Jun 2, 2021 · 26 comments
@Bobgy
Contributor

Bobgy commented Jun 2, 2021

UPDATE: 1.7.0 release is out! https://github.com/kubeflow/pipelines/releases/tag/1.7.0

Code release blockers:

After code release:

Docs release blockers:

DONE:

After release:

Prepare:

@Bobgy Bobgy self-assigned this Jun 2, 2021
@Bobgy Bobgy changed the title [release] 1.6.1 tracker [release] 1.7.0 tracker Jun 4, 2021
@Bobgy
Contributor Author

Bobgy commented Jun 4, 2021

Updated to a 1.7.0 release, because we have big changes such as upgrading the MLMD store to 1.0.0.

@Bobgy
Contributor Author

Bobgy commented Jul 6, 2021

kubeflow/testing@03c6258
Updated the kfp-ci test cluster to 1.7.0-alpha.2 for further testing (with Argo v3.1.1).

@Bobgy
Contributor Author

Bobgy commented Aug 11, 2021

Issue: #6294

KFP GCP Marketplace: the support section should point to GitHub issues instead of the Slack channel, because maintainers only check GitHub issues regularly. We should also add a link to the documentation.
[screenshot]

@Bobgy
Contributor Author

Bobgy commented Aug 11, 2021

Issue: #6294

The MKP "use emissary executor" option tooltip is malformed; we cannot use Markdown there:
[screenshot]

@james-jwu
Contributor

james-jwu commented Aug 11, 2021

Issue: #6306

FR: add v2 DSL sample to the "Getting Started" page

The sample was added as a built-in Pipeline sample. We should add it to the "Getting Started" page as well.

EDIT:
A similar FR by @Bobgy: add v2 DSL documentation to the "Getting Started" page.

@Bobgy
Contributor Author

Bobgy commented Aug 11, 2021

Issue: #6307

The pipeline root parameter is called pipeline-output-directory in the KFP UI/SDK.
Can we rename it to pipeline-root to be consistent with the SDK API?

@james-jwu
Contributor

james-jwu commented Aug 11, 2021

EDIT: already fixed, see #5779 (comment)

Running the sample that adds integers passes the first time, but fails the second time:

```
F0811 23:31:33.486213 15 main.go:50] Failed to execute component: failed to store output parameter value from cache: failed to parse parameter name="Output" value =15 to double: %!w()
```

https://2d99063a196f3ce0-dot-us-central1.pipelines.googleusercontent.com/#/runs/details/4de32ffc-f51f-4fc5-95d2-d084213d9f56

```python
import kfp
import kfp.dsl as dsl
from kfp.v2.dsl import component

@component
def add(a: float, b: float) -> float:
    '''Calculates the sum of two arguments.'''
    return a + b

@dsl.pipeline(
    name='addition-pipeline',
    description='An example pipeline that performs addition calculations.',
    # pipeline_root='gs://my-pipeline-root/example-pipeline'
)
def add_pipeline(a: float = 1, b: float = 7):
    add_task = add(a, b)

from kfp import compiler
compiler.Compiler(mode=kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE).compile(
    pipeline_func=add_pipeline, package_path='pipeline.yaml')

client = kfp.Client()  # assumes a KFP endpoint reachable from this environment
client.create_run_from_pipeline_func(
    add_pipeline,
    arguments={'a': 7, 'b': 8},
    mode=kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE,
)
```

@Bobgy
Contributor Author

Bobgy commented Aug 11, 2021

I think the above error was fixed by 2b78d16#diff-e5ea1eac6ea305b6e11fb743d381b3d9189a086e1a88ac8bf53bf64add225977L328-R330.
We just need to release KFP SDK 1.7.1 with v2 compatible mode.

@rui5i
Contributor

rui5i commented Aug 11, 2021

The status icon doesn't get updated until the page is refreshed. (Is this not supported yet, or a bug?)
[screenshot]

@rui5i
Contributor

rui5i commented Aug 11, 2021

The label names are a bit confusing:
`pipelines.kubeflow.org/cache_enabled: 'true'`
`pipelines.kubeflow.org/enable_caching: 'true'`

Maybe we should add documentation for these two labels.

@capri-xiyue
Contributor

> The label names are a bit confusing:
> `pipelines.kubeflow.org/cache_enabled: 'true'`
> `pipelines.kubeflow.org/enable_caching: 'true'`
>
> Maybe we should add documentation for these two labels.

`pipelines.kubeflow.org/cache_enabled: 'true'` is for v1, and `pipelines.kubeflow.org/enable_caching: 'true'` is for v2 compatible mode and v2. We should document this somewhere.
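To make the v1/v2 split concrete, here is a small sketch. The two label keys are the real ones from above; the helper function itself is hypothetical, written only to illustrate which caching mechanism each label indicates.

```python
# The label keys are real KFP pod labels; this helper is a hypothetical
# illustration, not part of the KFP codebase.
V1_CACHE_LABEL = "pipelines.kubeflow.org/cache_enabled"
V2_CACHE_LABEL = "pipelines.kubeflow.org/enable_caching"

def caching_mode(pod_labels: dict) -> str:
    """Return which KFP caching mechanism a pod's labels indicate."""
    if pod_labels.get(V2_CACHE_LABEL) == "true":
        return "v2"        # v2 and v2 compatible mode
    if pod_labels.get(V1_CACHE_LABEL) == "true":
        return "v1"        # classic v1 caching
    return "disabled"
```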

@Bobgy
Contributor Author

Bobgy commented Aug 11, 2021

PR: #6309

When I deploy a new AI Platform Pipelines cluster with managed storage, the v2 compatible tutorial does not work out of the box, because it still defaults to the minio://mlpipeline bucket, and that bucket doesn't exist.

```
F0811 23:48:42.309367 20 main.go:72] Failed to execute component: failed to upload output artifact "output_dataset_one" to remote storage URI "minio://mlpipeline/v2/artifacts/pipeline/[Tutorial] V2 lightweight Python components/31b88af6-8aad-45a1-aed8-58c935ea3fb5/preprocess/output_dataset_one": uploadFile(): unable to complete copying "/minio/mlpipeline/v2/artifacts/pipeline/[Tutorial] V2 lightweight Python components/31b88af6-8aad-45a1-aed8-58c935ea3fb5/preprocess/output_dataset_one" to remote storage "pipeline/[Tutorial] V2 lightweight Python components/31b88af6-8aad-45a1-aed8-58c935ea3fb5/preprocess/output_dataset_one": failed to close Writer for bucket: blob (key "pipeline/[Tutorial] V2 lightweight Python components/31b88af6-8aad-45a1-aed8-58c935ea3fb5/preprocess/output_dataset_one") (code=Unknown): AccessDenied: Access Denied.
	status code: 403, request id: 169A656A37803974, host id:
```

Fix: we should configure the default pipeline root to GCS when managed storage is enabled.
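The proposed fix can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name and the GCS path layout are hypothetical, and the minio default mirrors the URI seen in the error log above.

```python
# Hypothetical sketch of the proposed fix, not actual KFP deployment code:
# when managed storage is enabled, default the pipeline root to the managed
# GCS bucket instead of the in-cluster minio bucket.
def default_pipeline_root(managed_storage_enabled: bool, gcs_bucket: str = "") -> str:
    if managed_storage_enabled and gcs_bucket:
        # Path layout under the bucket is an assumption for illustration.
        return f"gs://{gcs_bucket}/pipeline-root"
    # Current default, as seen in the error log above.
    return "minio://mlpipeline/v2/artifacts"
```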

@Bobgy
Contributor Author

Bobgy commented Aug 11, 2021

Issue: #6308

UI bug: in v2 compatible mode, output artifacts have the URI "gcs://xxxx", but it should be "gs://xxxx" when using a GCS bucket as the pipeline root.

[screenshot]
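A minimal sketch of the kind of scheme normalization the fix needs (the actual frontend is TypeScript; this Python function is only an illustration):

```python
# Illustration only: rewrite the non-standard "gcs://" scheme to the
# canonical "gs://" scheme before rendering the artifact URI.
def normalize_gcs_uri(uri: str) -> str:
    if uri.startswith("gcs://"):
        return "gs://" + uri[len("gcs://"):]
    return uri
```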

@james-jwu
Contributor

Compiling the Vertex Pipeline sample produced an error:

```
...
/opt/conda/lib/python3.7/site-packages/kfp/compiler/v2_compat.py in update_op(op, pipeline_name, pipeline_root, launcher_image)
    127     k8s_client.V1EnvFromSource(config_map_ref=config_map_ref))
    128
--> 129 op.arguments = list(op.container_spec.command) + list(op.container_spec.args)
    130
    131 runtime_info = {

AttributeError: 'NoneType' object has no attribute 'command'
```
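The traceback shows the failing line dereferencing `op.container_spec` without checking for `None`. Below is a sketch of a defensive guard; the function name and error message are hypothetical, mirroring the traceback rather than the actual kfp patch.

```python
# Hypothetical guard around the failing line from the traceback above.
def build_arguments(container_spec):
    if container_spec is None:
        raise ValueError(
            "op.container_spec is None; v2-compatible mode can only "
            "rewrite container-based ops")
    # `or []` also guards against a spec with no command or no args.
    return list(container_spec.command or []) + list(container_spec.args or [])

class FakeContainerSpec:
    """Stand-in for the real container spec, for illustration."""
    command = ["python", "main.py"]
    args = ["--epochs", "10"]
```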

@zijianjoy
Collaborator

My attempt to fix the TFX taxi prediction tutorial issue: I tried to download the tfx dependency from the first cell of https://github.com/kubeflow/pipelines/blob/master/samples/core/parameterized_tfx_oss/taxi_pipeline_notebook.ipynb, with the tfx version specification flag removed:

```
!python3 -m pip install pip --upgrade --quiet --user
```

I tried this as well:

```
pip install tfx
```

but the download has been very slow to finish on my notebook. And I cannot run them locally, because tfx supports Ubuntu and Linux only: tensorflow/tfx#1229.

@capri-xiyue
Contributor

capri-xiyue commented Aug 12, 2021

How should the cache work when everything is the same except the pipeline_root? For example:

  • pipeline #1 uses minio://xxxx as the pipeline_root
  • pipeline #2 uses gs://xxxx as the pipeline_root
  • other configs of pipeline #1 and pipeline #2 are the same

Now pipeline #2 will hit the cache of pipeline #1, but it won't use the pipeline root under gs://xxxx; it will use the one under minio://xxxx, as pipeline #1 did.

Discussed with @Bobgy @IronPan, this is the expected behavior.

EDIT by @Bobgy:
However, until support for downloading artifacts from outside the pipeline root is implemented, the pipeline still breaks.
TODOs:

  • add this as a known caveat
  • when supporting importer, we need to do the same fix
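The behavior described above can be illustrated with a toy cache key: if the fingerprint is computed only from the component spec and inputs, two runs that differ only in pipeline_root produce the same key and share a cache entry. The helper below is hypothetical, not KFP's actual cache key code.

```python
# Toy illustration of why pipeline_root does not affect cache hits:
# it is simply not part of the fingerprint.
import hashlib
import json

def cache_fingerprint(component_spec: dict, inputs: dict) -> str:
    # pipeline_root is deliberately absent from the payload.
    payload = json.dumps({"spec": component_spec, "inputs": inputs},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# A run with a minio:// root and a run with a gs:// root, but identical
# spec and inputs, get the same fingerprint, so the second run reuses the
# first run's cached artifacts (which live under the minio:// root).
```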

@Bobgy
Contributor Author

Bobgy commented Aug 12, 2021

@Bobgy
Contributor Author

Bobgy commented Aug 12, 2021

We need documentation for the pipeline root.

#6310

@Bobgy
Contributor Author

Bobgy commented Aug 12, 2021

@rui5i yes, that's expected behavior for now.
cc @zijianjoy to consider as a FR.

#5779 (comment)

@zijianjoy
Collaborator

> @rui5i yes, that's expected behavior for now.
> cc @zijianjoy to consider as a FR.
>
> #5779 (comment)

@Bobgy @rui5i Sounds good, created #6317 to track this issue.

@zijianjoy
Collaborator

I got the following errors when I ran the following samples using KFP backend 1.7.0-rc.3 and SDK 1.7.0:

```
python3 -m samples.test.metrics_visualization_v2_test
python3 -m samples.test.metrics_visualization_v1_test
```

The error I got:

[screenshot]

@Bobgy
Contributor Author

Bobgy commented Aug 13, 2021

As of this comment, every open issue has been added to https://github.com/kubeflow/pipelines/projects/13#column-14866979.

@zijianjoy
Collaborator

Hello @chensun, should we export HTML and Markdown in https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/v2/dsl/__init__.py#L23? I am not able to use them when building a visualization pipeline.

@zijianjoy
Collaborator

zijianjoy commented Aug 13, 2021

Update: I will send out a PR to fix this. The reason is that the MLMD custom properties name has been changed to display_name.

All visualizations in v2 syntax are unavailable:
[screenshot]

Logs:

```
time="2021-08-13T18:25:45.311Z" level=info msg="capturing logs" argo=true
I0813 18:25:45.368149      15 cache.go:120] Connecting to cache endpoint 10.119.242.61:8887
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Loading KFP component module ephemeral_component from dir /tmp/tmp.ibcrdjWT5C
W0813 18:26:51.561024      15 launcher.go:548] Local filepath "/gcs/jamxl-kfp-bucket/v2/pipeline/metrics-visualization-pipeline/6872a7f7-70e7-470b-937b-9475b684ba1d/wine-classification/metrics" does not exist
time="2021-08-13T18:26:51.674Z" level=info msg="/tmp/outputs/metrics/data -> /var/run/argo/outputs/artifacts//tmp/outputs/metrics/data.tgz" argo=true
time="2021-08-13T18:26:51.675Z" level=info msg="Taring /tmp/outputs/metrics/data"
```

@Bobgy
Contributor Author

Bobgy commented Aug 14, 2021

@zijianjoy visualization testing has always been an area of headache, because it needs to be done manually. I think it's possible to automate this by building an e2e test that:

  1. runs the visualization pipeline and returns a URI to the run details page
  2. runs a UI e2e test (maybe using Cypress) that goes to the run details page, clicks the node and the visualization tab, and verifies the visualization tab content

This would remove the need for manual testing.

Recording this idea for reference; we can continue to observe how often this breaks.
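Step 1 of the idea above can be sketched as a tiny helper that builds the run details URL from a run ID; the URL shape matches the run links pasted elsewhere in this thread, and the function name and host value are placeholders (the UI half would live in a Cypress/TypeScript test, not shown here).

```python
# Hypothetical helper for step 1: given the KFP host and a run ID,
# build the run details page URL for the UI e2e test to visit.
# The "/#/runs/details/<id>" shape matches run links in this thread.
def run_details_url(host: str, run_id: str) -> str:
    return f"{host.rstrip('/')}/#/runs/details/{run_id}"
```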

@Bobgy
Contributor Author

Bobgy commented Sep 9, 2021

1.7.0 on Google Cloud Marketplace has also been released!

@Bobgy Bobgy closed this as completed Sep 9, 2021