Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update caching arguments to include ComponentOp and the Image class #802

Merged
merged 2 commits into from
Jan 23, 2024

Conversation

PhilippeMoussalli
Copy link
Contributor

@PhilippeMoussalli PhilippeMoussalli commented Jan 19, 2024

PR that updates the cache key estimation to take into account the ComponentSpec instead of the ComponentOp (operation op has both the component spec and the inner produces and consumes as well as the Image

Tested on the example pipeline and caching seems to work fine for lightweight component, only issue is that it will always download the requirements (fondant + extra user dependencies) before skipping the execution of the component. This can become frustrating for iterative development with an increase of dependencies and components.

We might be able to mitigate this by providing base images with pre-installed Fondant version and move the installation of the extras after the cache hit estimation. Best to tackle this in a separate PR. Issue ticket here

The PR also include some small changes to function names to best reflect what they do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the parquet file to include more rows since I was running into issue with the small dataset (probably due to empty partitions)

@@ -94,8 +94,8 @@ def __init__(
image: str,
*,
description: t.Optional[str] = None,
consumes: t.Optional[t.Dict[str, t.Union[str, pa.DataType]]] = None,
produces: t.Optional[t.Dict[str, t.Union[str, pa.DataType]]] = None,
consumes: t.Optional[t.Dict[str, t.Union[str, pa.DataType, bool]]] = None,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to accommodate for 'additional_field=True`

Copy link
Collaborator

@GeorgesLorre GeorgesLorre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx ! We'll see about the requirements in another PR

@PhilippeMoussalli PhilippeMoussalli merged commit c9b125d into main Jan 23, 2024
8 of 9 checks passed
@PhilippeMoussalli PhilippeMoussalli deleted the update-caching-arguments branch January 23, 2024 07:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants