0.10.0
🪶 Lightweight components to easily develop and iterate new components
We now support building lightweight components, currently the easiest way to get started with building your own custom components. Compared to containerized components, lightweight components remove the need to specify custom files for building a component (requirements, Dockerfile, component specification).
import pandas as pd
import pyarrow as pa

from fondant.component import PandasTransformComponent
from fondant.pipeline import lightweight_component


@lightweight_component(produces={"z": pa.int32()})
class AddNumber(PandasTransformComponent):
    def __init__(self, n: int):
        self.n = n

    def transform(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        dataframe["z"] = dataframe["x"].map(lambda x: x + self.n)
        return dataframe
Lightweight components are constructed by decorating a Python component class, like AddNumber above, with the @lightweight_component decorator. The decorator transforms your class into a Fondant component that can run on both local and remote runners. 🚀
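The decorator pattern itself can be tried without Fondant installed. The sketch below is a toy stand-in and not Fondant's implementation: `toy_lightweight_component`, the plain `AddNumber` class without a Fondant base class, and the list-of-dicts rows are all hypothetical and chosen for illustration only. It only shows how a decorator that takes arguments can attach a declared `produces` schema to a class while leaving the class usable. Making the component runnable on local and remote runners is not modeled here.

```python
from typing import Any, Callable, Dict, List, Optional, Type

Row = Dict[str, Any]


def toy_lightweight_component(
    produces: Optional[Dict[str, str]] = None,
) -> Callable[[Type], Type]:
    """Hypothetical stand-in: record the declared output fields on the class."""

    def decorator(cls: Type) -> Type:
        # Attach the declared output schema as metadata; the class is returned unchanged otherwise.
        cls.produces = dict(produces or {})
        return cls

    return decorator


@toy_lightweight_component(produces={"z": "int32"})
class AddNumber:
    def __init__(self, n: int):
        self.n = n

    def transform(self, rows: List[Row]) -> List[Row]:
        # Same logic as the AddNumber example above: z = x + n for every row.
        return [{**row, "z": row["x"] + self.n} for row in rows]


component = AddNumber(n=2)
result = component.transform([{"x": 1}, {"x": 5}])
print(AddNumber.produces)
print(result)
```

Because the schema is stored on the class, it can be inspected before any data is processed, which is one plausible reason to declare `produces` in the decorator rather than inside `transform`.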
Some of the benefits of those components are:
⏩ Reduced development efforts
Decrease the amount of work needed to develop a component. This is especially relevant for components that perform simple tasks (e.g., filtering a column on a certain value).
🔄 Accelerated iterations
With the component script integrated inline within your code, the development and iteration process becomes significantly faster.
🛠️ Customization
Despite their lightweight nature, these components remain flexible. Users can still customize them as needed by incorporating extra requirements, specifying a custom image, and more.
Check out our new guide for more details.
🚥 RAG Component updates
- Added support for creating embeddings with an external module, so you no longer have to provide your own embeddings. More info here
- Enabled hybrid search and reranking in the Weaviate retrieve component
What's Changed
- Support applying Lightweight Python components in Pipeline SDK by @GeorgesLorre in #770
- Add support to run lightweight python components in docker runner by @RobbeSneyders in #786
- Enable testing index Weaviate by @PhilippeMoussalli in #790
- Cleanup and add more tests by @GeorgesLorre in #792
- Make embeddings optional in weaviate component by @PhilippeMoussalli in #791
- Integrate argument inference by @RobbeSneyders in #788
- Enable hybrid search by @PhilippeMoussalli in #794
- Enable reranking by @PhilippeMoussalli in #796
- Bump kfp version and enable python 3.11 by @GeorgesLorre in #800
- Support lightweight Python components on Sagemaker by @RobbeSneyders in #804
- Validation for lightweight components by @mrchtr in #793
- Update caching arguments to include ComponentOp and the Image class by @PhilippeMoussalli in #802
- Feature/kfp support for lightweight components by @GeorgesLorre in #803
- Add initial docs on Python Components by @PhilippeMoussalli in #812
- Add fondant base image by @mrchtr in #801
- Update getting started guide by @mrchtr in #816
- Update lightweight docs by @PhilippeMoussalli in #817
- Simplify component init interface by @PhilippeMoussalli in #819
- Enable build of fondant dev base image by @mrchtr in #818
- Simplify component naming by @GeorgesLorre in #815
- Enable write components to cache by @PhilippeMoussalli in #814
- Remove datacomp pipeline reference by @mrchtr in #822
- Fix Readme script generation by @GeorgesLorre in #821
- Fix component readme generation by @GeorgesLorre in #828
- Use fondant dev image if fondant dev version is installed by @mrchtr in #820
- Start from dataset schema for lightweight python component consumes by @RobbeSneyders in #789
- Update lightweight docs by @PhilippeMoussalli in #827
- Add produces to lightweight component by @PhilippeMoussalli in #829
Full Changelog: 0.9.0...0.10.0