Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core feature] Convert List[Any] to a single pickle file #1535

Merged

Conversation

Yicheng-Lu-llll
Copy link
Member

@Yicheng-Lu-llll Yicheng-Lu-llll commented Mar 6, 2023

TL;DR

Currently, flyte create N (size of the list) pickle files if the output type is List[Any]. This slows down serialization. it takes more than 15 mins to upload the pickles to s3 if the size of the list is 1000 (see Slack Thread)

This PR:

  • converts the entire list into a single pickle file by default.
  • takes batches of list items and uploads each batch as a separate pickle file if the user annotates the batch size(see below).
@task
def t1() -> Annotated[List[Any],  100]

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

Originally, in ListTransformer.to_literal, Items are converted one by one. It is time-consuming, especially for large lists.

lit_list = [TypeEngine.to_literal(ctx, x, t, expected.collection_type) for x in python_val] # type: ignore

This PR checks if python_type is list[flytePickle]. If so, batch the list of items and call FlytePickleTransformer.to_literal.

Measure the performance

Test file:

from typing import List, Dict, Any
from flytekit import task, Resources, workflow

@task(
    limits=Resources(mem="4Gi",cpu="1"),
    disable_deck=True,
)
def test() -> List[Any]:
    return  [{"a": {0: "foo"}}] * 10000

@workflow
def wf():
    test()

if __name__ == "__main__":
    wf()

# yichenglu/flytekit:mytest5 contains this PR
# pyflyte run  --image yichenglu/flytekit:mytest5 --remote ./test.py  wf
# cr.flyte.org/flyteorg/flytekit:py3.10-1.4.0b1 is the default image
# pyflyte run  --image cr.flyte.org/flyteorg/flytekit:py3.10-1.4.0b1 --remote ./test.py  wf

In an 4VCPU, 16G RAM EC2 instance, measuring the performance in flyte cluster using the above test file and the commands:

  • Prior to these changes, , converting a list with type List[Any] with 10000 items needs 157 seconds.
  • After this PR, convertinga list with type List[Dict[str, Any]] with 10000 items only needs 7 seconds.
    image
    image

backward compatibility:

  • case1: old task's cached output -> new task
from typing import List, Dict, Any, Annotated
from flytekit import task, Resources, workflow
@task(
    limits=Resources(mem="4Gi",cpu="1"),
    disable_deck=True,
    container_image="{{.image.oldImage}}",
    cache=True, cache_version="1.0"   
)
def task0() -> List[List[Any]]:
    return   [ ["foo","foo","foo"] ] * 2
@task(
    limits=Resources(mem="4Gi",cpu="1"),
    disable_deck=True,
    container_image="{{.image.prImage}}",
)
def task1(data: List[List[Any]]) -> List[List[Any]]:
    print(data)
    return  data
@workflow
def wf():
    data = task0()
    task1(data=data)
# pyflyte run --image prImage="yichenglu/flytekit:mytest4" --image oldImage="cr.flyte.org/flyteorg/flytekit:py3.10-1.4.1" --remote ./test.py  wf

image

Backward compatibility is preserved in this case. Can see pod log ton double check.

  • case2: new task(image include this PR) -> old task(image without this PR)
from typing import List, Dict, Any, Annotated
from flytekit import task, Resources, workflow
@task(
    limits=Resources(mem="4Gi",cpu="1"),
    disable_deck=True,
    container_image="{{.image.prImage}}",
)
def task0() -> List[List[Any]]:
    return   [ ["foo","foo","foo"] ] * 2
@task(
    limits=Resources(mem="4Gi",cpu="1"),
    disable_deck=True,
    container_image="{{.image.oldImage}}"
)
def task1(data: List[List[Any]]) -> List[List[Any]]:
    print(data)
    return  data
@workflow
def wf():
    data = task0()
    task1(data=data)
# pyflyte run --image prImage="yichenglu/flytekit:mytest4" --image oldImage="cr.flyte.org/flyteorg/flytekit:py3.10-1.4.1" --remote ./test.py  wf

image
[[['foo', 'foo', 'foo']], [['foo', 'foo', 'foo']]] will be printed if see the pod's log
This occurs because in Task0, the outputs are batched, while in Task 1, which does not include this PR, the results are not expanded.

While slightly impacting backward compatibility, it is justifiable to ask users to update their images for two primary reasons:

  • Before the implementation of this PR, users faced lengthy wait times for tasks involving pickling to be completed.
  • Only tasks with the type list[any] will be affected. In most scenarios, users can still utilize the old images or cached outputs.

Tracking Issue

flyteorg/flyte#3207

Follow-up issue

NA

@welcome
Copy link

welcome bot commented Mar 6, 2023

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

@codecov
Copy link

codecov bot commented Mar 7, 2023

Codecov Report

Merging #1535 (ea7e1aa) into master (ee2714f) will increase coverage by 0.29%.
The diff coverage is 84.72%.

@@            Coverage Diff             @@
##           master    #1535      +/-   ##
==========================================
+ Coverage   69.65%   69.95%   +0.29%     
==========================================
  Files         315      319       +4     
  Lines       29093    29585     +492     
  Branches     2761     5335    +2574     
==========================================
+ Hits        20265    20696     +431     
- Misses       8314     8370      +56     
- Partials      514      519       +5     
Impacted Files Coverage Δ
flytekit/types/pickle/__init__.py 0.00% <0.00%> (ø)
flytekit/types/pickle/pickle.py 45.16% <33.33%> (-1.27%) ⬇️
flytekit/core/type_engine.py 61.18% <80.00%> (+0.43%) ⬆️
flytekit/core/promise.py 52.70% <100.00%> (+0.17%) ⬆️
tests/flytekit/unit/core/test_flyte_pickle.py 91.80% <100.00%> (+0.42%) ⬆️
tests/flytekit/unit/core/test_promise.py 89.88% <100.00%> (+0.35%) ⬆️
tests/flytekit/unit/core/test_type_engine.py 98.62% <100.00%> (+0.03%) ⬆️

... and 31 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

flytekit/core/type_engine.py Outdated Show resolved Hide resolved
flytekit/core/type_engine.py Outdated Show resolved Hide resolved
flytekit/core/type_engine.py Outdated Show resolved Hide resolved
flytekit/core/type_engine.py Show resolved Hide resolved
flytekit/core/type_engine.py Outdated Show resolved Hide resolved
flytekit/core/type_engine.py Outdated Show resolved Hide resolved
Copy link
Member

@pingsutw pingsutw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pingsutw
Copy link
Member

Prior to these changes, , converting a list with type List[Dict[str, Any]] with 1000 items needs 271.448063737 seconds.
After this PR, convertinga list with type List[Dict[str, Any]] with 1000 items only needs 2.990736610998283 seconds.

Amazing, thanks for testing it

@Yicheng-Lu-llll Yicheng-Lu-llll force-pushed the ConvertListAny]ToSinglePickleFile branch 2 times, most recently from debcfed to 01aba67 Compare March 13, 2023 01:55
pingsutw
pingsutw previously approved these changes Mar 14, 2023
Yicheng-Lu-llll and others added 12 commits March 25, 2023 18:03
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Copy link
Member

@pingsutw pingsutw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Have you tried to run it with cache enabled in the sandbox?

tests/flytekit/unit/core/test_type_engine.py Outdated Show resolved Hide resolved
tests/flytekit/unit/core/test_type_engine.py Outdated Show resolved Hide resolved
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Copy link
Contributor

@wild-endeavor wild-endeavor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! can we also do a quick end to end test please on sandbox if you haven't already

flytekit/core/promise.py Show resolved Hide resolved
flytekit/core/type_engine.py Show resolved Hide resolved
flytekit/core/type_engine.py Outdated Show resolved Hide resolved
flytekit/core/type_engine.py Outdated Show resolved Hide resolved
@Yicheng-Lu-llll
Copy link
Member Author

thanks! can we also do a quick end to end test please on sandbox if you haven't already

Certainly! May I know which test you are referring to? I have tested the following two cases(shown in the PR description):

  • old task's cached output -> new task
  • new task(image include this PR) -> old task(image without this PR)

Do you mean to add cases that demonstrate successful execution?

Yicheng-Lu-llll and others added 3 commits March 30, 2023 04:07
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@apache.org>
@welcome
Copy link

welcome bot commented Apr 4, 2023

Congrats on merging your first pull request! 🎉

eapolinario pushed a commit that referenced this pull request May 16, 2023
* Convert List[Any] to a single pickle file

* remove redundant code

* keep batchSize only if type contain flytePickle

* fix error

* add batch support to translate_inputs_to_literals

* add ci test

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve comment

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* add more ci test

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* handle HashMethod case

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve format

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* add test_is_batchable

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* Add BatchSize

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test_batch_size

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: root <root@ip-172-31-10-58.ec2.internal>
Co-authored-by: Kevin Su <pingsutw@apache.org>
eapolinario added a commit that referenced this pull request May 17, 2023
* Data subsystem (#1526)

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Pluck retry from flytekit and into sagemaker (#1411)

* Remove retry from flytekit's setup.py and regenerate requirements

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Add to sagemaker

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Remove retry from sagemaker plugin requirements file

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Restore doc-requirements.txt

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Fix bad merge

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

---------

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Update the pypi wait (#1554)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Stream Directories and Files using Flyte (#1512)

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>
Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Make `FlyteFile` compatible with `Annotated[..., HashMethod]` (#1544)

* fix: Make FlyteFile compatible with Annotated[..., HashMethod]

See issue #3424

Signed-off-by: Adrian Rumpold <a.rumpold@gmail.com>

* tests: Add test case for FlyteFile with HashMethod annotation

Issue: #3424
Signed-off-by: Adrian Rumpold <a.rumpold@gmail.com>

* fix: Use typing_extensions.Annotated for py3.8 compatibility

Issue: #3424
Signed-off-by: Adrian Rumpold <a.rumpold@gmail.com>

* fix: Use `get_args` and `get_origin` from typing_extensions for py3.8 compatibility

Issue: #3424
Signed-off-by: Adrian Rumpold <a.rumpold@gmail.com>

* fix(tests): Use fixture for local dummy file

Signed-off-by: Adrian Rumpold <a.rumpold@gmail.com>

---------

Signed-off-by: Adrian Rumpold <a.rumpold@gmail.com>

* move FlyteSchema deprecation warning to initialization method (#1558)

* move FlyteSchema deprecation warning to initialization method

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* update

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

---------

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* add pod_template and pod_template_name arguments for ContainerTask (#1515)

* add pod_template and pod_template_name arguments for ContainerTask

Signed-off-by: Felix Ruess <felix.ruess@roboception.de>

* factor out _serialize_pod_spec into separate util function

Signed-off-by: Felix Ruess <felix.ruess@roboception.de>

* model file changes, couple other changes

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* minor cleanup

Signed-off-by: Felix Ruess <felix.ruess@roboception.de>

* add unit test for container_task pod_template

Signed-off-by: Felix Ruess <felix.ruess@roboception.de>

* bump min version of flyteidl to 1.3.12

for pod template data config support

Signed-off-by: Felix Ruess <felix.ruess@roboception.de>

* require flyteidl==1.3.12 in doc-requirements.txt

Signed-off-by: Felix Ruess <felix.ruess@roboception.de>

---------

Signed-off-by: Felix Ruess <felix.ruess@roboception.de>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Co-authored-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Pass locally defined scopes to RemoteClientConfigStore (#1553)

Signed-off-by: franco-bocci <franco.bocci@wolt.com>

* TypeTransformer for TensorFlow model (#1562)

* TypeTransformer for TensorFlow model

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* clean up

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* clean up

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix lint

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

---------

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* Remove fsspec flytekit plugin from main Dockerfile (#1561)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Device auth flow / Headless auth (#1552)

* Device auth flow

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* Device AuthFlow is now available in flytekit

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* unit tests

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* test added

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* updated

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* Fixed test

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* Fixed unit test

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* Fixed lint errors

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

---------

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* url encode secret in client credentials flow (#1566)

* url encode secret first

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* nit

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

---------

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Python run multiple files (#1559)

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update comment

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* General Partial support in flytekit and multi-list support in flytekit (#1556)

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* fix: Silence keyring warnings by changing to debug (#1568)

Co-authored-by: ggydush-fn <greg.gydush@freenome.com>

* Support GCP secrets (#1571)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Automatically remove unused import (#1574)

* Automatically remove unused import

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Disallow partial lists in map tasks (#1577)

* Disallow partial lists in map tasks

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Lint

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

---------

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Remove duplicate reporting logic (#1578)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* [Core feature] Convert List[Any] to a single pickle file (#1535)

* Convert List[Any] to a single pickle file

* remove redundant code

* keep batchSize only if type contain flytePickle

* fix error

* add batch support to translate_inputs_to_literals

* add ci test

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve comment

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* add more ci test

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* handle HashMethod case

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve format

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* improve

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* add test_is_batchable

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* Add BatchSize

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test_batch_size

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: root <root@ip-172-31-10-58.ec2.internal>
Co-authored-by: Kevin Su <pingsutw@apache.org>

* Improve authoring structure documentation (#1572)

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* Unify sqlalchemy Dockerfiles (#1585)

* Unify sqlalchemy Dockerfiles

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Simplifyte sqlalchemy Dockerfile

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Set pythonpath to /root

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

---------

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Pyflyte build imageSpec (#1555)

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* pyflyte build

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Support serialize and package

Signed-off-by: Kevin Su <pingsutw@apache.org>

* more tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* move to plugin

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fixed tested

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fixed tested

Signed-off-by: Kevin Su <pingsutw@apache.org>

* more tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add support passing yaml in pyflyte run

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Feature/3506: Improve Type Conversion errors, use rich to prettify error messages (#1582)

* improve input type conversion error

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix lint

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* add tests

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* add tests

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* add rich

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix lint

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* remove prototyping script, update loggers

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* update __init__.py

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* update logger

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* update logger

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix GE and pandera tests

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* fix lint

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

---------

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* Add support nested FlyteFile in pyflyte run (#1587)

* Add support nested FlyteFile in pyflyte run

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Set cache_regions=True in the case of s3fs (#1592)

* Set cache_regions=True in the case of s3fs

* Linting

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Fix tests

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* More linting

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

---------

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Set default ComSpec when running on Windows (#1595)

* set default ComSpec when running on Windows

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Feature: add activate-launchplan command to pyflyte (#1588)

* add activate-launchplan command to pyflyte

Signed-off-by: Artem Petrov <58334441+wckdman@users.noreply.github.com>

* fix linting

Signed-off-by: Artem Petrov <58334441+wckdman@users.noreply.github.com>

* rework launchplan command

Signed-off-by: Artem Petrov <58334441+wckdman@users.noreply.github.com>

* fix linting

Signed-off-by: Artem Petrov <58334441+wckdman@users.noreply.github.com>

---------

Signed-off-by: Artem Petrov <58334441+wckdman@users.noreply.github.com>

* nit (#1596)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Add log streaming to papermill plugin (#1129)

* checkpoint

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Experimental implementation works well. Instead of messing with the class, we utilize the interpolation and context to dynamically generate a wrapper around the desired script. In the wrapper, we cd to the ctx.working_directory, export the env variables (handled by an additional method to convert dict to str), and then pass the arguments to the script

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* fix spacing

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* remove breakpoint

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Output ctx.working_directory as single output

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* more doc strings

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* more comments

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Added comments

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Add tests and test files from other branch

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* minor fix, have function return the instance rather than create. It seems to get registered when flyte packages your project

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* fix tests

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* remove set flags not supported by sh

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* fix spellcheck lint errors

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* don't have a windows equivalent test script so bypassing those tests for now

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Add typing to make_export_string_from_env_dict params

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* fixup doc string

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Refactored the new behavior out into a new class. Did not change implementation details at all

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Address linter issues and up test cov

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Skip test on windows, no equivalent script

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Run black and isort, address SC2236

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Refactored class name to remove _, make utility function require name so multiple uses in a workflow don't break

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* fix utility function

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* fix utility function call in tests

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* Fix isort on test_shell.py

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>

* adding logging settings to papermill plugin

Signed-off-by: Calvin Leather <cleather@embarkvet.com>

* set level to info logging

* improved comments/docs

---------

Signed-off-by: Mike Zhong <mzhong@embarkvet.com>
Signed-off-by: Calvin Leather <cleather@embarkvet.com>
Co-authored-by: Mike Zhong <mzhong@embarkvet.com>
Co-authored-by: Mike Zhong <mike.zhong89@gmail.com>

* Set lower bound for s3fs (#1598)

* set lower bound for s3fs

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update doc-requirements.txt

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Implement horovod task in mpi plugin (#1575)

* Add horovod task to mpi plugin

Signed-off-by: byhsu <byhsu@linkedin.com>

* Remove unused

Signed-off-by: byhsu <byhsu@linkedin.com>

* add test

Signed-off-by: byhsu <byhsu@linkedin.com>

* inherit from mpifunctiontask

Signed-off-by: byhsu <byhsu@linkedin.com>

* fix format

Signed-off-by: byhsu <byhsu@linkedin.com>

* fix format

Signed-off-by: byhsu <byhsu@linkedin.com>

---------

Signed-off-by: byhsu <byhsu@linkedin.com>
Co-authored-by: byhsu <byhsu@linkedin.com>

* add ImageRender class for Flyte Decks (#1599)

Signed-off-by: esad <esad@peridotml.io>

* Bump tensorflow from 2.10.0 to 2.11.1 in /plugins/flytekit-mlflow (#1586)

Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 2.10.0 to 2.11.1.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](tensorflow/tensorflow@v2.10.0...v2.11.1)

---
updated-dependencies:
- dependency-name: tensorflow
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Handle batchable lists of length 0 (#1600)

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* pyflyte run imperative workflow (#1597)

* pyflyte run imperative workflow

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Backfill should fill in the right input vars (#1593)

* Backfill fix

- Backfill was using incorrect arguments
- backfill should use the argument that user provides or none at all

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* Updated code

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* fixed unit test

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

---------

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* Pyflyte prettified (#1602)

* Pyflyte prettified

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* added dependency

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

---------

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* pyflyte run now supports json/yaml files (#1606)

* pyflyte run now supports json files

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* added yaml support

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* fixed parsing

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* fixed windows test

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

---------

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>

* Add is_inside func to ImageSpec (#1601)

* Add is_inside func to ImageSpec

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add is_inside func to ImageSpec

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* rename

Signed-off-by: Kevin Su <pingsutw@apache.org>

* rename

Signed-off-by: Kevin Su <pingsutw@apache.org>

* check execution mode

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Measuring the time taken for each component when executing a task (#1581)

* add time measurement framework

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* add time measurement framework

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* nit

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* nit

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* nit

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* nit

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* nit

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

---------

Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>

* Skip the image building process if the check for its existence fails (#1614)

* Fixed tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* add lru cache

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Enable torch elastic training (torchrun) (#1603)

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>
Signed-off-by: Fabio Graetz <fabiograetz@googlemail.com>

This PR brings [torch elastic training (`torchrun`)](https://pytorch.org/docs/stable/elastic/run.html) to the pytorch plugin:

```python
from flytekitplugins.kfpytorch import Elastic

@task(
    task_config=Elastic(
        replicas=4,
        nproc_per_node=4,
        ...
    ),
    ...
)
def train(...):
    ...
```

flyteorg/flyte#3614

* update sphinx youtube plugin (#1619)

* update sphinx youtube plugin

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Read polar dataframe without copying to local (#1618)

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add default spark image (#1616)

* Add default spark image

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Lazy load modules (#1590)

* lazy load module

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lazy load module

Signed-off-by: Kevin Su <pingsutw@apache.org>

* import

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* keep structured dataset in flytekit init

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fixed tess

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fixed tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fixed tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* move import pandas to __init__

Signed-off-by: Kevin Su <pingsutw@apache.org>

* use lazy import loader instead

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fixed tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fixed tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* regular import

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fixed tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Override node name with a name containing underscore (#1608)

* override node_name with a name containing underscore

Signed-off-by: Kevin Su <pingsutw@apache.org>

* add comment

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* External Plugin Service (grpc) (#1524)

External Plugin Service

* Include traceback in errors from admin (#1621)

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Add support for copying all the files in source root  (#1622)

* Add support for copying all the files in source root

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Add tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fix PipelineModel transformer issue 3648 (#1623)

Signed-off-by: esad <esad@peridotml.io>

* update GH workflow for external plugin service (#1624)

Signed-off-by: Kevin Su <pingsutw@apache.org>

* [Issue-3617] Enables FlyteFiles, FlyteDirectors, and StructuredDatasets inputs in papermill plugin (#1612)

* Add support for using a list as an input for a subworkflow (#1605)

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Improve task decorator type hints with overload (#1631)

Without the overload, the decorated function does not have the proper type of PythonFunctionTask, leading to spurious type errors when attempting to register the task on a FlyteRemote object

Signed-off-by: Matthew Hoffman <matthew@protopia.ai>

* add annotation option for serialization (#1615)

* add annotation option for serialization

* add support for guessing type

* formatting

* test both directions

* remove guessing for annotated types

* clarify usage in test

* Update tests/flytekit/unit/core/test_type_engine.py

---------

Co-authored-by: Eli Bixby <eli@cradle.bio>
Co-authored-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* Delete removed data persistence classes from docs (#1633)

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* Improve workflow decorator type hints with overload (#1635)

Previously, the workflow decorator is hinted as always returning a WorkflowBase, which is not true when _workflow_function is None; similar to #1631, we propose using typing.overload to differentiate the return type of workflow based on the value of _workflow_function

Signed-off-by: Matthew Hoffman <matthew@protopia.ai>

* Fix mypy errors (#1313)

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix mypy errors

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix mypy errors

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* wip

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* fix test

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Update type

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Fix tests

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* update dev-requirements.txt

Signed-off-by: Kevin Su <pingsutw@apache.org>

* Address comment

Signed-off-by: Kevin Su <pingsutw@apache.org>

* upgrade torch

Signed-off-by: Kevin Su <pingsutw@apache.org>

* nit

Signed-off-by: Kevin Su <pingsutw@apache.org>

* lint

Signed-off-by: Kevin Su <pingsutw@apache.org>

---------

Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Kevin Su <pingsutw@gmail.com>
Co-authored-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

* type-protobuf<4

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Allow for python 3.7

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Fix protobuf transformer merge error

* Fix bad merge of s/_task_models/task_models/

* Add maxsize=128 to @lru_cache to cover python 3.7

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Lint

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Regenerate doc-requirements.txt

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Add missing call to lru_cache

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Bump version and regenerate requirements in a few plugins

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Revert "Fix mypy errors (#1313)"

This reverts commit af49155.

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* Remove docs

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* More linting

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Fix type of get_type_for_output_var

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Remove tensorflow test_model

* Replace use of shutil.copytree

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Other use of copytree

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Stringify paths in test_script_mode

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* os.path.isfile instead in imagespec

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Remove tests

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* More linting

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

* Remove flaky test

Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>

---------

Signed-off-by: Ketan Umare <ketan.umare@gmail.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: eduardo apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>
Signed-off-by: Adrian Rumpold <a.rumpold@gmail.com>
Signed-off-by: Felix Ruess <felix.ruess@roboception.de>
Signed-off-by: franco-bocci <franco.bocci@wolt.com>
Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Yicheng-Lu-llll <luyc58576@gmail.com>
Signed-off-by: Artem Petrov <58334441+wckdman@users.noreply.github.com>
Signed-off-by: Mike Zhong <mzhong@embarkvet.com>
Signed-off-by: Calvin Leather <cleather@embarkvet.com>
Signed-off-by: byhsu <byhsu@linkedin.com>
Signed-off-by: esad <esad@peridotml.io>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Matthew Hoffman <matthew@protopia.ai>
Signed-off-by: Kevin Su <pingsutw@gmail.com>
Co-authored-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Ketan Umare <16888709+kumare3@users.noreply.github.com>
Co-authored-by: Adrian Rumpold <a.rumpold@gmail.com>
Co-authored-by: Niels Bantilan <niels.bantilan@gmail.com>
Co-authored-by: Felix Ruess <felix.ruess@roboception.de>
Co-authored-by: Franco Bocci <121866694+franco-bocci@users.noreply.github.com>
Co-authored-by: Samhita Alla <aallasamhita@gmail.com>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: ggydush <35151789+ggydush@users.noreply.github.com>
Co-authored-by: ggydush-fn <greg.gydush@freenome.com>
Co-authored-by: Yicheng-Lu-llll <51814063+Yicheng-Lu-llll@users.noreply.github.com>
Co-authored-by: root <root@ip-172-31-10-58.ec2.internal>
Co-authored-by: Artem <58334441+wckdman@users.noreply.github.com>
Co-authored-by: Calvin Leather <cleather@embarkvet.com>
Co-authored-by: Mike Zhong <mzhong@embarkvet.com>
Co-authored-by: Mike Zhong <mike.zhong89@gmail.com>
Co-authored-by: ByronHsu <byronhsu1230@gmail.com>
Co-authored-by: byhsu <byhsu@linkedin.com>
Co-authored-by: peridotml <106936600+peridotml@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Fabio M. Graetz, Ph.D <fabiograetz@googlemail.com>
Co-authored-by: Matthew Hoffman <matthew@protopia.ai>
Co-authored-by: Eli Bixby <elibixby@google.com>
Co-authored-by: Eli Bixby <eli@cradle.bio>
@elibixby
Copy link
Contributor

elibixby commented May 28, 2023

I think with the Annotation functionality I added in #1615 you can accomplish this much more clearly and cleanly without bringing the functionality into TypeEngine, where it adds a lot of complexity, and doesn't even really provide this feature in many cases (for example what if I want to batch types which are already registered in the TypeEngine).

BatchedList = Annotated[List[Any], FlytePickleTransformer()]

@task
def my_task(...) -> BatchedList:
   ...

Or even using a custom transformer with a fixed batch size, and explicit type;


class BatchedList:

    def __class_getitem__(self, type: Type, batch_size: int):
         return  Annotated[List[type],  MyBatchingTransformer(t=type, batch_size=int)]


@task
def my_batching_task(...) -> BatchedList[int, 5E10]:
   ...

@Yicheng-Lu-llll Does this meet your needs?

If so perhaps we can revert this as it adds significant complexity. @wild-endeavor
I'm sorry I didn't catch this before it went in, the Annotation override feature is newer and not well documented.

@Yicheng-Lu-llll
Copy link
Member Author

Yicheng-Lu-llll commented May 28, 2023

Hi @elibixby, Thank you for commenting! We can further discuss about this.

I think with the Annotation functionality I added in #1615 you can accomplish this much more clearly and cleanly without bringing the functionality into TypeEngine, where it adds a lot of complexity, and doesn't even really provide this feature in many cases (for example what if I want to batch types which are already registered in the TypeEngine).

  1. The primary goal of this PR is to address the performance issues. Some users find it is extremely slow if they have large list with type Any. This issue is not apparent with other types. Hence, this PR focuses on batching the List[Any] type to enhance performance, while leaving all other types unaffected.

  2. In my initial implementation,I also change the behavior of get_transformer, However, as you pointed out, , this approach will lose the ability to change the batch size. From a user's perspective, creating a custom transformer to support batch size can be overly complex. Also, adding batch size to FlytePickleTransformer would introduce equivalent complexity.

  3. If we do not care about the batch size, I do strongly agree your approach is much better. But even we choose to disregard batch size and proceed with your suggestions, more modifications will be required beyond add annotation option for serialization #1615. For example, We need to make sure translate_inputs_to_literals and TypeEngine are consistent.

@elibixby
Copy link
Contributor

elibixby commented May 28, 2023

  1. In my initial implementation,I also change the behavior of get_transformer, However, as you pointed out, , this approach will lose the ability to change the batch size. From a user's perspective, creating a custom transformer to support batch size can be overly complex. Also, adding batch size to FlytePickleTransformer would introduce equivalent complexity.

I agree if we were to use annotations, we should provide a convenience wrapper for batch size to users, so they don't need to implement their own type transformer. I disagree that it would introduce equivalent complexity to this PR. It would be a completely isolated subclass of FlytePickleTransformer. (as a side note I am working on a PR right now which alters the way FlytePickleTransformer is used to be a default in get_transformer rather than a special case only for function signatures as implmented in #1565 , I've pulled this into a separate change #1661 to make it clearer what I propose here)

  1. If we do not care about the batch size, I do strongly agree your approach is much better. But even we choose to disregard batch size and proceed with your suggestions, more modifications will be required beyond add annotation option for serialization #1615. For example, We need to make sure translate_inputs_to_literals and TypeEngine are consistent.

This is something I've discussed pretty extensively with @wild-endeavor . I think the preferred solution here is to make it so that adding a TypeTransformer by itself (either through registration or annotation), is sufficient to fully the support the type in Flyte.

The main thrust of this is in supporting pyflyte run (as you point out), but it could fairly easily be supported by adding a from_json to the TypeTransformer base class which pyflyte uses to unmarshal yaml data.

Would you be open to a solution that addresses both of these issues through an Annotated type as I've just described?

@Yicheng-Lu-llll
Copy link
Member Author

Yicheng-Lu-llll commented May 28, 2023

I disagree that it would introduce equivalent complexity to this PR. It would be a completely isolated subclass of FlytePickleTransformer

If no batch size is specified, when a list comes in, the FlytePickleTransformer can simply transform the entire list into a single pickle file. However, with a batch size, things may be different. For example, we can split the list into three, and then convert them into three separate pickle files. But then, how to store and handle it as a literal seems to be a problem. Because it appears we have to change the literal type (from blob to LiteralCollection), otherwise we can not simply deal with the three separate pickle files. This is why I said "may introduce equivalent complexity to the current FlytePickleTransformer". It would make things easier to handle the batching process within the ListTransformer, which naturally supports the LiteralCollection.

Would you be open to a solution that addresses both of these issues through an Annotated type as I've just described?

Overall make sense to me. But I am not sure how will you deal with the above concern.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants