Small readme tweaks #937

Merged 1 commit on May 28, 2024
17 changes: 9 additions & 8 deletions README.md
@@ -31,14 +31,15 @@ Fondant enables you to initialize datasets, apply various operations on them, an

## 💨 Getting Started

-Fondant allows you to easily define workflows comprised of both reusable and custom components. The following example uses the reusable load_from_hf_hub component to load a dataset from the Hugging Face Hub and process it using a custom component that will resize the images resulting in a new dataset.
+Fondant allows you to easily define workflows comprised of both reusable and custom components. The following example uses the reusable `load_from_hf_hub component` to load a dataset from the Hugging Face Hub and process it using a custom component that will resize the images resulting in a new dataset.


```pipeline.py
import pyarrow as pa

from fondant.dataset import Dataset

# initialize a dataset by loading data from the Hugging Face Hub
raw_data = Dataset.create(
    "load_from_hf_hub",
    arguments={
@@ -55,15 +56,15 @@ raw_data = Dataset.create(
        "top_level_domain": pa.string(),
    },
)

# add an operation to download the images from the urls
images = raw_data.apply(
    "download_images",
    arguments={
        "input_partition_rows": 100,
        "resize_mode": "no",
    },
)

# add an operation to resize the images
dataset = images.apply(
    "resize_images",
    arguments={
@@ -85,13 +86,13 @@ Once you have a pipeline you can easily run (and compile) it by using the built-
fondant run local pipeline.py
```
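
For scripted runs, the command above could also be assembled and launched from Python. The following is a minimal sketch and not part of the Fondant README or API: `fondant_command` and `run_fondant` are hypothetical helper names, and the sketch assumes the `fondant` executable is on `PATH`.

```python
import shlex
import shutil
import subprocess


def fondant_command(*args: str) -> list[str]:
    """Build the argument list for a fondant CLI call (hypothetical helper)."""
    return ["fondant", *args]


def run_fondant(*args: str) -> int:
    """Run the fondant CLI and return its exit code.

    If the executable is not found on PATH, print the command that
    would have been run and return 127 instead of raising.
    """
    cmd = fondant_command(*args)
    if shutil.which(cmd[0]) is None:
        print(f"fondant not found on PATH; would run: {shlex.join(cmd)}")
        return 127
    # Pass a list (no shell) so arguments are not re-parsed by a shell.
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_fondant("run", "local", "pipeline.py")` would then mirror the shell command shown above, and `run_fondant("--help")` the help command.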

-To see all available runner and arguments you can check the fondant CLI help pages
+To see all available runners and arguments you can check the fondant CLI help pages

```bash
fondant --help
```

-Or for a subcommand:
+Or for a specific subcommand:

```bash
fondant <subcommand> --help
@@ -115,10 +116,10 @@ fondant <subcommand> --help
Here's what Fondant brings to the table:
- 🔧 Plug ‘n’ play composable data processing workflows
- 🧩 Library containing off-the-shelf reusable components
-- 🐼 A simple Pandas based interface for creating custom components
+- 🐼 A simple Pandas based dataframe interface for creating custom components
- 📊 Built-in lineage, caching, and data explorer
- 🚀 Production-ready, scalable deployment
-- ☁️ Integration with runners across different clouds (Vertex, Sagemaker, Kubeflow)
+- ☁️ Integration with runners across different clouds (Vertex on Google Cloud, Sagemaker on AWS, Kubeflow on any k8s cluster)

👉 **Check our [Component Hub](https://fondant.ai/en/latest/components/hub/) for an overview of all
available components**
@@ -141,7 +142,7 @@ An end-to-end Fondant pipeline that starts from our Fondant-CC-25M creative comm

## ⚒️ Installation

-First, run the minimal Fondant installation:
+First, run the basic Fondant installation:

```
pip install fondant