Large scale controlnet #260

Merged: PhilippeMoussalli merged 14 commits into main from large-scale-controlnet on Jul 6, 2023

Conversation

PhilippeMoussalli (Contributor) commented Jul 3, 2023

PR for running the controlnet pipeline end-to-end on KFP.

Some observations when doing the pipeline testing:

  • Tested with @ChristiaensBert's VM: it runs really well and is much faster than the public clip service.
  • I could not test everything end to end locally since the GPU components are difficult to run locally, so I switched to KFP to leverage the GPU VMs.
  • I had to rebuild the images using the build and tag images script in the `scripts` folder. I think we still need to modify the script so it can build only specified components, since it currently defaults to all components in the `components` directory, which can take some time to build.
  • The local runner does not seem to do the subset checking yet, and we still need to expand the CLI to be able to use the kfp runner (currently not supported). The CLI is really nice overall, though :)
  • The pipeline runs fine and writes the dataset to the hub, but fails at the end since it expects an output manifest. This can be resolved with this ticket (#221). We should prioritize this.
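
The idea of building only specified components could look roughly like the sketch below. It is not the project's build script: the function name `select_components` and the `only` parameter are hypothetical, and the sketch assumes each component lives in its own subdirectory of `components`.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def select_components(components_dir: Path,
                      only: Optional[Iterable[str]] = None) -> List[Path]:
    """Return the component directories that should be built.

    Hypothetical helper: with `only` unset, every subdirectory is selected
    (the current default described above); otherwise only the named ones.
    """
    # Each subdirectory is assumed to hold one component.
    available = sorted(p for p in components_dir.iterdir() if p.is_dir())
    if only is None:
        return available
    wanted = set(only)
    unknown = wanted - {p.name for p in available}
    if unknown:
        # Fail early on typos instead of silently building nothing.
        raise ValueError(f"Unknown components: {sorted(unknown)}")
    return [p for p in available if p.name in wanted]
```

A build loop would then iterate over the returned directories instead of over everything in `components`, which keeps the default behaviour while allowing a faster, targeted rebuild.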

Notes:

  • Changed the segmentation to output a segmentation image instead of a segmentation array since that's the output expected for controlnet training

Things to do:

  • Estimate how much the job would cost

@PhilippeMoussalli PhilippeMoussalli self-assigned this Jul 3, 2023
@PhilippeMoussalli PhilippeMoussalli added the Components Implementation of components label Jul 3, 2023
@PhilippeMoussalli PhilippeMoussalli linked an issue Jul 3, 2023 that may be closed by this pull request
GeorgesLorre (Collaborator) commented Jul 4, 2023

> The local runner does not seem to do the subset checking yet and we still need to expand the CLI to be able to use the kfp runner (currently not supported). Although the CLI is really nice overall :)

Yes, we still need to dissect the current pipeline.py.

url:
  description: The url of the backend clip retrieval service, defaults to the public service
  type: str
  default: https://knn.laion.ai/knn-service
Contributor

It doesn't seem like the main.py script has a default

Contributor Author

The defaults defined here are translated internally into defaults in the argument parser, since kfp requires a value for every specified argument and that value cannot be empty.

parser.add_argument("--url", default="https://knn.laion.ai/knn-service")

The values defined in the argument parser generally take precedence over the default values defined in the main.py file, so adding defaults there can be a bit misleading (e.g. if a user changes them in main.py, the changed values won't be used).
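
A minimal sketch of the precedence described above, using only the standard `argparse` module. The `run` function and its local URL default are hypothetical stand-ins for a main.py entry point; they are not taken from the component's code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Default mirrors the `default` field of the component specification.
    parser.add_argument("--url", default="https://knn.laion.ai/knn-service")
    return parser


def run(url: str = "http://localhost:1234/knn-service") -> str:
    # Hypothetical entry point. Its own default is never used as long as
    # the caller always passes the parsed argument.
    return url


parser = build_parser()

# No flag passed: the parser default is used, not the function default.
args = parser.parse_args([])
print(run(args.url))

# Flag passed explicitly: the provided value wins over both defaults.
args = parser.parse_args(["--url", "http://my-vm:1234/knn-service"])
print(run(args.url))
```

Because the parser always supplies a value, editing the default in `run` has no effect, which is the misleading behaviour the comment warns about.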

@PhilippeMoussalli PhilippeMoussalli changed the title Large scale controlnet [WIP] Large scale controlnet Jul 6, 2023
@PhilippeMoussalli PhilippeMoussalli force-pushed the large-scale-controlnet branch from 19a9422 to c81400c Compare July 6, 2023 13:17
@PhilippeMoussalli PhilippeMoussalli force-pushed the large-scale-controlnet branch from c81400c to 858701e Compare July 6, 2023 13:18
@@ -1,6 +1,6 @@
 name: Download images
 description: Component that downloads images based on URLs
-image: ghcr.io/ml6team/download_images:dev
+image: ghcr.io/ml6team/download_images:latest
Member

Suggested change
-image: ghcr.io/ml6team/download_images:latest
+image: ghcr.io/ml6team/download_images:dev

The images on main should be pinned to `dev`, which corresponds to the latest main version; `latest` corresponds to the latest release.

Contributor Author

Right, I forgot to revert this!


return color_seg
crop_bytes = io.BytesIO()
image.save(crop_bytes, format="JPEG")
Member

Does this actually save the image to disk?

Contributor Author

No, this just saves it to crop_bytes, which is a BytesIO object (an in-memory buffer that stores the image in binary format).
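
The in-memory save can be sketched as follows. The helper name `image_to_bytes` is hypothetical, and the function only assumes an object with a Pillow-style `save(fp, format=...)` method, so the block itself needs nothing beyond the standard library.

```python
import io


def image_to_bytes(image, fmt: str = "JPEG") -> bytes:
    """Encode an image into bytes without touching the filesystem.

    `image` is assumed to expose a Pillow-style `save(fp, format=...)`
    method that writes the encoded data to a file-like object.
    """
    buffer = io.BytesIO()           # file-like object backed by memory
    image.save(buffer, format=fmt)  # writes encoded bytes into the buffer
    return buffer.getvalue()        # full contents, independent of position
```

With Pillow installed, passing a `PIL.Image.Image` (for example `Image.new("RGB", (8, 8))`) should return the JPEG-encoded bytes, and no file is created on disk.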

Member

Ok makes sense, thanks!

@PhilippeMoussalli PhilippeMoussalli changed the title [WIP] Large scale controlnet Large scale controlnet Jul 6, 2023
@PhilippeMoussalli PhilippeMoussalli merged commit 7f27124 into main Jul 6, 2023
@PhilippeMoussalli PhilippeMoussalli deleted the large-scale-controlnet branch July 6, 2023 14:33
Hakimovich99 pushed a commit that referenced this pull request Oct 16, 2023
Labels
Components Implementation of components
Development

Successfully merging this pull request may close these issues.

Run Controlnet use case at scale with custom LAION backend
4 participants