Feature/kfp support for lightweight components #803

Merged: 3 commits into main from feature/kfp-support-for-lightweight-components on Jan 23, 2024

Conversation

GeorgesLorre
Collaborator

Tested successfully on Vertex.

TODO:

  • test on KFP
  • add compile tests that include lightweight components
  • investigate duplicate data dirs

@GeorgesLorre
Collaborator Author

FYI, this is the pipeline I ran on Vertex:

import dask.dataframe as dd
import pandas as pd
import pyarrow as pa
from fondant.component import DaskLoadComponent, PandasTransformComponent
from fondant.pipeline import Pipeline, lightweight_component

pipeline = Pipeline(
    name="lightweight-pipeline",
    base_path="./data",
)


@lightweight_component(
    base_image="python:3.10",
    extra_requires=[
        "fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@main"
    ],
)
class CreateData(DaskLoadComponent):
    def load(self) -> dd.DataFrame:
        df = pd.DataFrame(
            {
                "x": [1, 2, 3],
                "y": [4, 5, 6],
            },
            index=pd.Index(["a", "b", "c"], name="id"),
        )
        return dd.from_pandas(df, npartitions=1)


dataset = pipeline.read(
    ref=CreateData,
    produces={"x": pa.int32(), "y": pa.int32()},
)


@lightweight_component(
    base_image="python:3.10",
    extra_requires=[
        "fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@main"
    ],
)
class AddN(PandasTransformComponent):
    def __init__(self, n: int = 1, **kwargs):
        self.n = n

    def transform(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        dataframe["x"] = dataframe["x"].map(lambda x: x + self.n)
        return dataframe


_ = dataset.apply(
    ref=AddN,
    produces={"x": pa.int32(), "y": pa.int32()},
    consumes={"x": pa.int32(), "y": pa.int32()},
)


if __name__ == "__main__":
    from fondant.pipeline.runner import VertexRunner

    pipeline.base_path = "REDACTED"

    runner = VertexRunner(
        project_id="REDACTED",
        region="europe-west1",
        service_account="REDACTED",
    )
    runner.run(input=pipeline)
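
The transform step can also be sanity-checked locally without a runner. A minimal sketch, assuming only pandas is installed; `add_n` is a hypothetical helper that copies the body of `AddN.transform` and is not a Fondant API:

```python
import pandas as pd


def add_n(dataframe: pd.DataFrame, n: int = 1) -> pd.DataFrame:
    """Hypothetical stand-alone copy of the AddN.transform logic (no Fondant involved)."""
    dataframe = dataframe.copy()  # avoid mutating the caller's frame
    dataframe["x"] = dataframe["x"].map(lambda x: x + n)
    return dataframe


# Same toy data that CreateData.load builds in the pipeline above.
df = pd.DataFrame(
    {"x": [1, 2, 3], "y": [4, 5, 6]},
    index=pd.Index(["a", "b", "c"], name="id"),
)

result = add_n(df, n=1)
print(result)
```

This only exercises the pandas logic; it says nothing about how the component is packaged or compiled for Vertex or KFP.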

@@ -318,10 +318,6 @@ def default_arguments(self) -> t.Dict[str, Argument]:
),
}

@property
def kubeflow_specification(self) -> "KubeflowComponentSpec":
Collaborator Author

I'm considering removing KubeflowComponentSpec from component_spec.py and integrating it into the compiler, which is the only place where it is used.
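
One way to picture this refactor: the spec class stays runner-agnostic and the compiler owns the Kubeflow conversion. This is only a sketch under assumptions; `SimpleComponentSpec`, `SketchKubeflowCompiler`, and the output dict layout are hypothetical and do not reflect Fondant's actual classes or the real Kubeflow component schema.

```python
import typing as t
from dataclasses import dataclass, field


@dataclass
class SimpleComponentSpec:
    """Hypothetical runner-agnostic spec: it knows nothing about Kubeflow."""

    name: str
    image: str
    args: t.Dict[str, str] = field(default_factory=dict)  # argument name -> type name


class SketchKubeflowCompiler:
    """Hypothetical compiler that owns the Kubeflow-specific conversion."""

    def _to_kubeflow_dict(self, spec: SimpleComponentSpec) -> t.Dict[str, t.Any]:
        # Illustrative layout only, not the real Kubeflow component schema.
        return {
            "name": spec.name,
            "image": spec.image,
            "inputs": [{"name": k, "type": v} for k, v in sorted(spec.args.items())],
        }

    def compile(self, specs: t.List[SimpleComponentSpec]) -> t.List[t.Dict[str, t.Any]]:
        return [self._to_kubeflow_dict(spec) for spec in specs]


compiled = SketchKubeflowCompiler().compile(
    [SimpleComponentSpec(name="add_n", image="python:3.10", args={"n": "int"})]
)
print(compiled)
```

With this shape, the spec module has no Kubeflow-specific property, and other runners would not need to import anything Kubeflow-related.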

Contributor

yes please

Contributor

@PhilippeMoussalli left a comment

Thanks Georges!

Are the ToDos still to be tackled in this PR?

@GeorgesLorre
Collaborator Author

> Thanks Georges!
>
> Are the ToDos still to be tackled in this PR?

Yes, I will do them.

@GeorgesLorre GeorgesLorre force-pushed the feature/kfp-support-for-lightweight-components branch from dca8072 to 94b47ff Compare January 22, 2024 15:59
Contributor

@PhilippeMoussalli left a comment

Thanks Georges!

Contributor

@mrchtr left a comment

LGTM!

@GeorgesLorre GeorgesLorre merged commit f5d5d4c into main Jan 23, 2024
11 checks passed
@GeorgesLorre GeorgesLorre deleted the feature/kfp-support-for-lightweight-components branch January 23, 2024 11:24
3 participants