[Alpha pipeline] Add image retrieval component #110
# TODO remove, just use a tiny df for testing purposes
data = {
    "prompts_text": [
        "comfortable bathroom, art deco interior design",
        "comfortable bathroom, bauhaus interior design",
    ]
}
pandas_df = pd.DataFrame.from_dict(data)
df = dd.from_pandas(pandas_df, npartitions=1)
# end of TODO
I would keep this while we're building the pipeline and remove it once the pipeline is finished.
@@ -1,2 +1,2 @@
git+https://github.com/ml6team/fondant.git
git+https://github.com/ml6team/fondant.git@2abeabe07412266b78e3b2a4055e0b10d62168cb
I'm guilty of this myself, but don't forget to remove this.
Thanks Niels :) Looks good! I left a few comments.
How did you manage to resolve the issues with slow retrieval eventually?
@@ -140,7 +141,9 @@ def write_index(self, df: dd.DataFrame):
)

# Write index
dd.compute(upload_index_task)
with ProgressBar():
Nice, something similar to tqdm with an ETA? Does it show up nicely in the kfp logging?
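For context on the question above, here is a minimal sketch of how `dask.diagnostics.ProgressBar` wraps a compute call. The `upload_chunk` function is a hypothetical stand-in, not the component's real upload task, and the output is redirected to an in-memory buffer only to make the behaviour easy to inspect. As far as I know, the bar reports a completion percentage and elapsed time rather than a tqdm-style ETA, and it only tracks Dask's local schedulers.

```python
import io

import dask
from dask.diagnostics import ProgressBar


@dask.delayed
def upload_chunk(chunk_id: int) -> int:
    # Hypothetical stand-in for the real index upload; it only echoes its input.
    return chunk_id


tasks = [upload_chunk(i) for i in range(4)]

# ProgressBar writes to sys.stdout by default; `out` redirects it to a buffer.
log = io.StringIO()
with ProgressBar(out=log):
    results = dask.compute(*tasks)

print(results)
```

Because the bar redraws itself with carriage returns, how it renders in kfp logs would need to be checked in an actual run.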
Will depend on the machine defined for the component, but often quite a few I think. And also multiple machines in the future.
@@ -0,0 +1,2 @@
git+https://github.com/ml6team/fondant.git@2abeabe07412266b78e3b2a4055e0b10d62168cb
same here
Thanks @NielsRogge!
As discussed before, I would make these reusable components available in a |
# end of TODO

# add id and source columns
df["id"] = df.assign(id=1).id.cumsum()
We should use the LAION ids here.
Can I leave this for a follow-up PR? Would like to first get all components up and running.
There's no real benefit to having the LAION IDs over integer indices that go from 0 to the length of the dataset for this pipeline.
This PR adds the image retrieval component, which retrieves image URLs using clip-retrieval from LAION-5B. Fixes #95 --------- Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>