Move container registry to DockerHub #514

Merged: 9 commits, Oct 12, 2023
12 changes: 9 additions & 3 deletions .github/workflows/build.yaml
@@ -4,6 +4,7 @@ on:
   push:
     branches:
       - main
+  workflow_dispatch:

 jobs:
   docker:
@@ -17,13 +18,18 @@ jobs:

       - name: Set buildx alias
         run: docker buildx install

+      - name: Install docker pushrm
+        run: |
+          sudo wget https://github.com/christian-korneck/docker-pushrm/releases/download/v1.9.0/docker-pushrm_linux_amd64 -O /usr/libexec/docker/cli-plugins/docker-pushrm
+          sudo chmod +x /usr/libexec/docker/cli-plugins/docker-pushrm
+          docker pushrm --help

       - name: Login to GitHub Container Registry
         uses: docker/login-action@v2
         with:
-          registry: ghcr.io
-          username: ${{ github.actor }}
-          password: ${{ secrets.GITHUB_TOKEN }}
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_ACCESS_TOKEN }}

       - name: Build components
         run: ./scripts/build_components.sh --cache -t $GITHUB_SHA -t dev
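For reference, the "Install docker pushrm" step added above can be sketched as a standalone script. This is only a sketch: it reuses the release URL, version, and plugin directory from the diff, while the function names `pushrm_url` and `install_pushrm` and the variable names are hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the "Install docker pushrm" workflow step.
# Function and variable names are illustrative, not part of the PR.

PUSHRM_VERSION="v1.9.0"                       # version pinned in the diff
PUSHRM_PLATFORM="linux_amd64"                 # asset suffix used in the diff
PLUGIN_DIR="/usr/libexec/docker/cli-plugins"  # target directory used in the diff

# Build the release download URL for a given version and platform.
pushrm_url() {
  local version="$1" platform="$2"
  printf '%s\n' "https://github.com/christian-korneck/docker-pushrm/releases/download/${version}/docker-pushrm_${platform}"
}

# Download the plugin, make it executable, and check that Docker can run it.
install_pushrm() {
  local target="${PLUGIN_DIR}/docker-pushrm"
  sudo wget "$(pushrm_url "$PUSHRM_VERSION" "$PUSHRM_PLATFORM")" -O "$target"
  sudo chmod +x "$target"
  docker pushrm --help
}

# Usage (requires network access, sudo, and a Docker CLI):
#   install_pushrm
```

Keeping the URL construction in its own function means the version is pinned in one place, which may help since the same step is repeated in all three workflows.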
10 changes: 7 additions & 3 deletions .github/workflows/prep-release.yaml
@@ -24,12 +24,16 @@ jobs:
       - name: Set buildx alias
         run: docker buildx install

+      - name: Install docker pushrm
+        run: |
+          sudo wget https://github.com/christian-korneck/docker-pushrm/releases/download/v1.9.0/docker-pushrm_linux_amd64 -O /usr/libexec/docker/cli-plugins/docker-pushrm
+          sudo chmod +x /usr/libexec/docker/cli-plugins/docker-pushrm

       - name: Login to GitHub Container Registry
         uses: docker/login-action@v2
         with:
-          registry: ghcr.io
-          username: ${{ github.actor }}
-          password: ${{ secrets.GITHUB_TOKEN }}
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_ACCESS_TOKEN }}

       - name: Build components
         run: ./scripts/build_components.sh -t $GITHUB_REF_NAME
10 changes: 7 additions & 3 deletions .github/workflows/release.yaml
@@ -24,12 +24,16 @@ jobs:
       - name: Set buildx alias
         run: docker buildx install

+      - name: Install docker pushrm
+        run: |
+          sudo wget https://github.com/christian-korneck/docker-pushrm/releases/download/v1.9.0/docker-pushrm_linux_amd64 -O /usr/libexec/docker/cli-plugins/docker-pushrm
+          sudo chmod +x /usr/libexec/docker/cli-plugins/docker-pushrm

       - name: Login to GitHub Container Registry
         uses: docker/login-action@v2
         with:
-          registry: ghcr.io
-          username: ${{ github.actor }}
-          password: ${{ secrets.GITHUB_TOKEN }}
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_ACCESS_TOKEN }}

       - name: Tag components
         run: ./scripts/tag_components.sh -o $GITHUB_REF_NAME -n latest
2 changes: 1 addition & 1 deletion components/caption_images/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Caption images
 description: This component captions images using a BLIP model from the Hugging Face hub
-image: ghcr.io/ml6team/caption_images:dev
+image: fndnt/caption_images:dev

 consumes:
   images:
2 changes: 1 addition & 1 deletion components/download_images/fondant_component.yaml
@@ -8,7 +8,7 @@ description: |
   [resizer](https://github.com/rom1504/img2dataset/blob/main/img2dataset/resizer.py) function
   from the img2dataset library.

-image: ghcr.io/ml6team/download_images:dev
+image: fndnt/download_images:dev

 consumes:
   images:
2 changes: 1 addition & 1 deletion components/embed_images/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Embed images
 description: Component that generates CLIP embeddings from images
-image: ghcr.io/ml6team/embed_images:dev
+image: fndnt/embed_images:dev

 consumes:
   images:
@@ -2,7 +2,7 @@ name: Embedding based LAION retrieval
 description: |
   This component retrieves image URLs from LAION-5B based on a set of CLIP embeddings. It can be
   used to find images similar to the embedded images / captions.
-image: ghcr.io/ml6team/embedding_based_laion_retrieval:dev
+image: fndnt/embedding_based_laion_retrieval:dev

 consumes:
   embeddings:
2 changes: 1 addition & 1 deletion components/filter_comments/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Filter comments
 description: Component that filters code based on the code to comment ratio
-image: ghcr.io/ml6team/filter_comments:dev
+image: fndnt/filter_comments:dev

 consumes:
   code:
2 changes: 1 addition & 1 deletion components/filter_image_resolution/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Filter image resolution
 description: Component that filters images based on minimum size and max aspect ratio
-image: ghcr.io/ml6team/filter_image_resolution:dev
+image: fndnt/filter_image_resolution:dev

 consumes:
   images:
2 changes: 1 addition & 1 deletion components/filter_line_length/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Filter line length
 description: Component that filters code based on line length
-image: ghcr.io/ml6team/filter_line_length:dev
+image: fndnt/filter_line_length:dev

 consumes:
   code:
2 changes: 1 addition & 1 deletion components/image_cropping/fondant_component.yaml
@@ -1,5 +1,5 @@
 name: Image cropping
-image: ghcr.io/ml6team/image_cropping:dev
+image: fndnt/image_cropping:dev
 description: |
   This component crops out image borders. This is typically useful when working with graphical
   images that have single-color borders (e.g. logos, icons, etc.).
@@ -1,6 +1,6 @@
 name: Image resolution extraction
 description: Component that extracts image resolution data from the images
-image: ghcr.io/ml6team/image_resolution_extraction:dev
+image: fndnt/image_resolution_extraction:dev

 consumes:
   images:
2 changes: 1 addition & 1 deletion components/language_filter/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Filter languages
 description: A component that filters text based on the provided language.
-image: ghcr.io/ml6team/filter_language:latest
+image: fndnt/filter_language:latest

 consumes:
   text:
2 changes: 1 addition & 1 deletion components/load_from_files/fondant_component.yaml
@@ -2,7 +2,7 @@ name: Load from files
 description: |
   This component loads data from files in a local or remote (AWS S3, Azure Blob storage, GCS)
   location. It supports the following formats: .zip, gzip, tar and tar.gz.
-image: ghcr.io/ml6team/load_from_files:dev
+image: fndnt/load_from_files:dev

 produces:
   file:
2 changes: 1 addition & 1 deletion components/load_from_hf_hub/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Load from hub
 description: Component that loads a dataset from the hub
-image: ghcr.io/ml6team/load_from_hf_hub:dev
+image: fndnt/load_from_hf_hub:dev

 produces:
   dummy_variable: #TODO: fill in here
2 changes: 1 addition & 1 deletion components/load_from_parquet/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Load from parquet
 description: Component that loads a dataset from a parquet uri
-image: ghcr.io/ml6team/load_from_parquet:dev
+image: fndnt/load_from_parquet:dev

 produces:
   dummy_variable: #TODO: fill in here
2 changes: 1 addition & 1 deletion components/minhash_generator/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: MinHash generator
 description: A component that generates minhashes of text.
-image: ghcr.io/ml6team/minhash_generator:latest
+image: fndnt/minhash_generator:latest

 consumes:
   text:
2 changes: 1 addition & 1 deletion components/pii_redaction/fondant_component.yaml
@@ -20,7 +20,7 @@ description: |
   PII is replaced by random data which is stored in the `replacements.json` file.
   A component that detects and redacts Personal Identifiable Information (PII) from
   code.
-image: ghcr.io/ml6team/pii_redaction:dev
+image: fndnt/pii_redaction:dev

 consumes:
   code:
@@ -5,7 +5,7 @@ description: |
   the prompt sentences and the captions in the LAION dataset.

   This component doesn’t return the actual images, only URLs.
-image: ghcr.io/ml6team/prompt_based_laion_retrieval:dev
+image: fndnt/prompt_based_laion_retrieval:dev

 consumes:
   prompts:
2 changes: 1 addition & 1 deletion components/segment_images/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Segment images
 description: Component that creates segmentation masks for images using a model from the Hugging Face hub
-image: ghcr.io/ml6team/segment_images:dev
+image: fndnt/segment_images:dev

 consumes:
   images:
2 changes: 1 addition & 1 deletion components/text_length_filter/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Filter text length
 description: A component that filters out text based on their length
-image: ghcr.io/ml6team/filter_text_length:latest
+image: fndnt/filter_text_length:latest

 consumes:
   text:
2 changes: 1 addition & 1 deletion components/text_normalization/fondant_component.yaml
@@ -1,5 +1,5 @@
 name: Normalize text
-image: ghcr.io/ml6team/text_normalization:latest
+image: fndnt/text_normalization:latest
 description: |
   This component implements several text normalization techniques to clean and preprocess textual
   data:
2 changes: 1 addition & 1 deletion components/write_to_hf_hub/fondant_component.yaml
@@ -1,6 +1,6 @@
 name: Write to hub
 description: Component that writes a dataset to the hub
-image: ghcr.io/ml6team/write_to_hf_hub:dev
+image: fndnt/write_to_hf_hub:dev

 consumes:
   dummy_variable: #TODO: fill in here
4 changes: 2 additions & 2 deletions docs/components/generic_component.md
@@ -33,7 +33,7 @@ The component specification can be modified as follows
 ```yaml
 name: Load from hub
 description: Component that loads a dataset from the hub
-image: ghcr.io/ml6team/load_from_hf_hub:latest
+image: fndnt/load_from_hf_hub:latest

 consumes:
   images:
@@ -100,7 +100,7 @@ If we want to write this dataset to a Hugging Face Hub location, we can use the
 ```yaml
 name: Write to hub
 description: Component that writes a dataset to the hub
-image: ghcr.io/ml6team/write_to_hf_hub:latest
+image: fndnt/write_to_hf_hub:latest

 consumes:
   images:
2 changes: 1 addition & 1 deletion docs/guides/build_a_simple_pipeline.md
@@ -57,7 +57,7 @@ Create a folder `component/load_from_hub` and create a `fondant_component.yaml`
 ```yaml
 name: Load from hub
 description: Component that loads a dataset from the hub
-image: ghcr.io/ml6team/load_from_hf_hub:dev
+image: fndnt/load_from_hf_hub:dev

 produces:
   images:
@@ -1,6 +1,6 @@
 name: Extract image licenses from warc
 description: A component that extracts images and their licenses from warc files
-image: ghcr.io/ml6team/extract_images_from_warc:d4619b5
+image: fndnt/extract_images_from_warc:dev

 consumes:
   warc:
@@ -1,6 +1,6 @@
 name: Common crawl download component
 description: A component that downloads parts of the common crawl
-image: ghcr.io/ml6team/read_warc_paths:57404ff
+image: fndnt/read_warc_paths:dev

 produces:
   warc:
@@ -1,6 +1,6 @@
 name: Generate prompts
 description: Component that generates a set of seed prompts
-image: ghcr.io/ml6team/generate_prompts:dev
+image: fndnt/generate_prompts:dev

 produces:
   prompts:
@@ -1,6 +1,6 @@
 name: Write to hub
 description: Component that writes a dataset to the hub
-image: ghcr.io/ml6team/write_to_hf_hub:latest
+image: fndnt/write_to_hf_hub:latest

 consumes:
   images:
@@ -1,6 +1,6 @@
 name: Add CLIP score
 description: Component that adds the CLIP score
-image: ghcr.io/ml6team/add_clip_score:dev
+image: fndnt/add_clip_score:dev

 consumes:
   embeddings:
@@ -1,6 +1,6 @@
 name: Clean captions
 description: Component that filters out bad captions (Empty captions, Captions with weird characters, Captions that are dates)
-image: ghcr.io/ml6team/clean_captions:50f3a97878ac81670ebe624039ff0fcec0542e4f
+image: fndnt/clean_captions:50f3a97878ac81670ebe624039ff0fcec0542e4f
Contributor:
This one I can't see on Docker Hub. I think the example pipeline images are not pushed to Docker Hub?

Member Author:
No, they are not built automatically. But the updated build script will push them there for you, so it's the same flow as before.
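
One way to check whether a given tag is already on Docker Hub is `docker manifest inspect`, which exits non-zero when the registry cannot resolve the reference. A minimal sketch, assuming a Docker CLI is available and the repository is public (or you are logged in); the `image_exists` helper and the `DOCKER` override variable are hypothetical and only there so the check can be stubbed.

```shell
#!/usr/bin/env bash
# Sketch only: `image_exists` and the DOCKER override are illustrative names.

# Return 0 if the registry can resolve the image reference, non-zero otherwise.
image_exists() {
  local ref="$1"
  # ${DOCKER:-docker} allows substituting the binary, e.g. for a dry run.
  "${DOCKER:-docker}" manifest inspect "$ref" >/dev/null 2>&1
}

# Usage (needs network access to the registry):
#   image_exists "fndnt/clean_captions:50f3a97878ac81670ebe624039ff0fcec0542e4f" \
#     && echo "found" || echo "not pushed yet"
```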


 consumes:
   text:
@@ -1,6 +1,6 @@
 name: Cluster embeddings
 description: Component that applies k-means clustering on subsampled image embeddings
-image: ghcr.io/ml6team/cluster_image_embeddings:latest
+image: fndnt/cluster_image_embeddings:latest

 consumes:
   image:
@@ -1,6 +1,6 @@
 name: Detect text
 description: Component that detects text in images using an mmocr model
-image: ghcr.io/ml6team/detect_text:dev
+image: fndnt/detect_text:dev

 consumes:
   images:
@@ -1,6 +1,6 @@
 name: Filter CLIP score
 description: Component that filters out bad captions (Empty captions, Captions with weird characters, Captions that are dates)
-image: ghcr.io/ml6team/filter_clip_score:dev
+image: fndnt/filter_clip_score:dev

 consumes:
   imagetext:
@@ -1,6 +1,6 @@
 name: Filter text complexity
 description: Component that filters text based on their dependency parse complexity and number of actions
-image: ghcr.io/ml6team/filter_text_complexity:dev
+image: fndnt/filter_text_complexity:dev

 consumes:
   text:
@@ -1,6 +1,6 @@
 name: Load from hub
 description: Component that loads a dataset from the hub
-image: ghcr.io/ml6team/load_from_hf_hub:dev
+image: fndnt/load_from_hf_hub:dev

 produces:
   images:
@@ -1,6 +1,6 @@
 name: Mask images
 description: Component that masks images based on bounding boxes
-image: ghcr.io/ml6team/mask_images:dev
+image: fndnt/mask_images:dev

 consumes:
   images:
@@ -1,6 +1,6 @@
 name: Load from hub
 description: Component that loads a dataset from the hub
-image: ghcr.io/ml6team/load_from_hf_hub:latest
+image: fndnt/load_from_hf_hub:latest

 produces:
   images:
@@ -1,6 +1,6 @@
 name: Load from hub
 description: Component that loads a dataset from the hub
-image: ghcr.io/ml6team/load_from_hf_hub:latest
+image: fndnt/load_from_hf_hub:latest

 produces:
   images:
@@ -1,6 +1,6 @@
 name: Write to hub
 description: Component that writes a dataset to the hub
-image: ghcr.io/ml6team/write_to_hf_hub:latest
+image: fndnt/write_to_hf_hub:latest

 consumes:
   images:
@@ -1,6 +1,6 @@
 name: Load code dataset from hub
 description: Component that loads the stack dataset from the hub
-image: ghcr.io/ml6team/load_from_hf_hub:latest
+image: fndnt/load_from_hf_hub:latest

 produces:
   code:
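
Most of the `image:` changes in this diff follow a single pattern: the `ghcr.io/ml6team/` prefix becomes `fndnt/`, with the image name and tag unchanged. A minimal sketch of that rewrite follows; `to_dockerhub_ref` is a hypothetical helper, not part of the PR. It covers only the prefix change, not the two entries where the tag also changes from a commit hash to `dev`.

```shell
#!/usr/bin/env bash
# Sketch only: `to_dockerhub_ref` is an illustrative helper, not part of the PR.

# Rewrite a ghcr.io/ml6team/<name>:<tag> reference to fndnt/<name>:<tag>.
# References that do not start with the old prefix are printed unchanged.
to_dockerhub_ref() {
  local ref="$1"
  local old_prefix="ghcr.io/ml6team/"
  local new_prefix="fndnt/"
  case "$ref" in
    "$old_prefix"*) printf '%s\n' "${new_prefix}${ref#"$old_prefix"}" ;;
    *)              printf '%s\n' "$ref" ;;
  esac
}

# Usage:
#   to_dockerhub_ref "ghcr.io/ml6team/caption_images:dev"
```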