Commit

Add dataset link to press release (#472)
mrchtr authored Sep 28, 2023
1 parent 6a2fd0e commit c8bd3b7
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion docs/announcements/CC_25M_community.md
@@ -7,7 +7,9 @@ execution environments, shared within the community.
A current challenge for generative AI is compliance with copyright laws. For this reason,
Fondant has developed a data-processing pipeline to create a 500-million dataset of Creative
Commons images to train a latent diffusion image generation model that respects copyright. Today,
-as a first step, we are releasing a 25-million sample dataset and invite the open source
+as a first step, we are releasing
+a [25-million sample dataset](https://huggingface.co/datasets/fondant-ai/fondant-cc-25m) and invite
+the open source
community to collaborate on further refinement steps.

Fondant offers tools to download, explore and process the data. The current example pipeline
