0.12.0
⚡️ Introducing the dataset-first interface
We have removed the pipeline interface and redesigned the Dataset class. Datasets are still built from load components as before, but you now use the Dataset class instead of the Pipeline class.
from fondant.dataset import Dataset

dataset = Dataset.create(
    "load_from_parquet",
    arguments={
        ...
    },
)
dataset = dataset.apply(...)
Additionally, datasets can now be initialized from previous workflow runs. A dataset is initialized from the manifest of such a run, so sharing a Fondant dataset comes down to sharing its manifest file.
from fondant.dataset import Dataset
dataset = Dataset.read("gs://.../manifest.json")
dataset = dataset.apply(...)
🛠️ Working directory
Since the pipeline no longer exists, we added a new CLI argument, --working-directory, to define a working directory. All workflow-related artifacts are stored in this directory.
fondant run local dataset --working-directory ./data
Fondant pipelines created with previous Fondant versions are no longer compatible with versions >=0.12.0. To migrate an existing pipeline, initialize your dataset using Dataset.create(...) instead of Pipeline.read(...), and use the former base_path as the working directory when you materialize your dataset.
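As a minimal sketch of the last migration step, the snippet below assembles the run command shown above with the former base_path as the working directory. It does not import Fondant. The helper name build_run_command and the former_base_path value are hypothetical, and only the "fondant run local ... --working-directory" form from these notes is assumed.

```python
import shlex


def build_run_command(dataset_ref: str, working_directory: str) -> list[str]:
    """Build the argv for `fondant run local <ref> --working-directory <dir>`."""
    return [
        "fondant", "run", "local", dataset_ref,
        "--working-directory", working_directory,
    ]


# Hypothetical value: the base_path the old pipeline was created with.
former_base_path = "./data"

# "dataset" is the reference used in the example command above.
command = build_run_command("dataset", former_base_path)
print(shlex.join(command))
```

With Fondant installed, the resulting list could be passed to subprocess.run(command, check=True) to start the run.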
What's Changed
- Refactor pipeline interface by @mrchtr in #901
- Update dataset documentation by @mrchtr in #918
- Remove pipeline references by @mrchtr in #923
- Update documentation dataset first interface by @mrchtr in #921
- Empty produces leading into list index out of range by @mrchtr in #924
- Remove working directory from user arguments by @mrchtr in #925
- Fix navigation documentation by @mrchtr in #926
- Fix link in the README file by @Philmod in #930
- Update readme with dataset focus by @GeorgesLorre in #928
- Mount absolute path of working dir to local runner by @mrchtr in #931
- Fixing cicd by @mrchtr in #929
- Fix arch link in readme by @GeorgesLorre in #933
- Set session duration to 5h in prep release pipeline by @mrchtr in #934
New Contributors
Full Changelog: 0.11.2...0.12.0