Update fondant install command in the readme #833

Merged 2 commits on Feb 2, 2024
81 changes: 17 additions & 64 deletions README.md
@@ -107,13 +107,21 @@ An end-to-end Fondant pipeline that starts from our Fondant-CC-25M creative comm

## ⚒️ Installation

Fondant can be installed using pip:
First, run the minimal Fondant installation:

```
pip install fondant
```

For more detailed installation options, check the [**installation page**](https://fondant.ai/en/latest/guides/installation/) on our documentation.
Fondant also includes extra dependencies for specific runners, storage integrations and publishing
components to registries.
We can install the local runner to enable local pipeline execution:

```
pip install fondant[docker]
```
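
As a quick sanity check after installing, the sketch below uses only the standard library's `importlib.metadata` to report whether a distribution is installed and which extras it declares. The helper names `installed_version` and `declared_extras` are illustrative assumptions, not part of Fondant, and the extras printed depend on whatever the installed package metadata lists.

```python
from importlib import metadata
from typing import List, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def declared_extras(dist_name: str) -> List[str]:
    """Return the extras a distribution declares, or [] if it is absent."""
    try:
        extras = metadata.metadata(dist_name).get_all("Provides-Extra")
    except metadata.PackageNotFoundError:
        return []
    return sorted(extras or [])


if __name__ == "__main__":
    # Hypothetical helpers; they only read package metadata, nothing is imported from fondant.
    print("fondant version:", installed_version("fondant"))
    print("declared extras:", declared_extras("fondant"))
```

Note that some shells (zsh, for example) treat square brackets as glob patterns, so quoting the requirement as `"fondant[docker]"` may be needed there.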

For more detailed installation options, check the [**installation page**](https://fondant.ai/en/latest/guides/installation/) on our documentation.


## 👨‍💻 Usage
@@ -126,10 +134,10 @@ to load a dataset from the Hugging Face Hub and process it using a custom compon

**_pipeline.py_**
```python
from fondant.pipeline import Pipeline

from fondant.pipeline import Pipeline

pipeline = Pipeline(pipeline_name="example pipeline", base_path="fs://bucket")
pipeline = Pipeline(name="example pipeline", base_path="./data")

dataset = pipeline.read(
    "load_from_hf_hub",
@@ -139,71 +147,16 @@ dataset = pipeline.read(
)

dataset = dataset.apply(
    "components/custom_component",
    "resize_images",
    arguments={
        "min_width": 600,
        "min_height": 600,
        "resize_width": 128,
        "resize_height": 128,
    },
)
```

#### Component

To create a custom component, you first need to describe its contract as a yaml specification.
It defines the data consumed and produced by the component and any arguments it takes.

**_fondant_component.yaml_**
```yaml
name: Custom component
description: This is a custom component
image: custom_component:latest

consumes:
  image:
    type: binary

produces:
  caption:
    type: utf8

args:
  argument1:
    description: An argument passed to the component at runtime
    type: str
  argument2:
    description: Another argument passed to the component at runtime
    type: str
```
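
To make the shape of this contract concrete, here is a minimal sketch that mirrors the specification above as a plain Python dictionary and checks it for obvious gaps. The `check_spec` helper and the choice of which keys count as required are assumptions made for illustration, based only on the example above; this is not Fondant's own validation logic.

```python
from typing import Any, Dict, List

# The example specification above, written out as a Python dict.
EXAMPLE_SPEC: Dict[str, Any] = {
    "name": "Custom component",
    "description": "This is a custom component",
    "image": "custom_component:latest",
    "consumes": {"image": {"type": "binary"}},
    "produces": {"caption": {"type": "utf8"}},
    "args": {
        "argument1": {
            "description": "An argument passed to the component at runtime",
            "type": "str",
        },
        "argument2": {
            "description": "Another argument passed to the component at runtime",
            "type": "str",
        },
    },
}

# Assumption: these keys are treated as required because the example sets them.
REQUIRED_TOP_LEVEL = ("name", "description", "image")


def check_spec(spec: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if not spec.get(key):
            problems.append(f"missing top-level key: {key}")
    # Every consumed field, produced field, and argument should declare a type.
    for section in ("consumes", "produces", "args"):
        for field, definition in (spec.get(section) or {}).items():
            if "type" not in definition:
                problems.append(f"{section}.{field} has no type")
    return problems


if __name__ == "__main__":
    print(check_spec(EXAMPLE_SPEC) or "no problems found")
```

A check like this only catches structural slips such as a field without a `type`; whether a given type name is accepted is up to Fondant itself.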

Once you have your component specification, all you need to do is implement a constructor
and a single `.transform` method and Fondant will do the rest. You will get the data defined in
your specification partition by partition as a Pandas dataframe.

**_component/src/main.py_**
```python
import pandas as pd
from fondant.component import PandasTransformComponent


class ExampleComponent(PandasTransformComponent):

    def __init__(self, *, argument1, argument2) -> None:
        """
        Args:
            argumentX: An argument passed to the component
        """
        # Initialize your component here based on the arguments

    def transform(self, dataframe: pd.DataFrame) -> pd.DataFrame:
        """Implement your custom logic in this single method
        Args:
            dataframe: A Pandas dataframe containing the data
        Returns:
            A pandas dataframe containing the transformed data
        """
```

For more advanced use cases, you can use the `DaskTransformComponent` instead.
Custom use cases require the creation of custom components. Check out our [getting started page](https://fondant.ai/en/latest/guides/first_pipeline/) to learn
more about how to build custom pipelines and components.

### Running your pipeline
