Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CLOOB and KELIP #14

Merged
merged 16 commits into from
May 1, 2022
Merged

Add CLOOB and KELIP #14

merged 16 commits into from
May 1, 2022

Conversation

apolinario
Copy link
Collaborator

Adding CLOOB and KELIP loaders

@apolinario apolinario requested a review from dmarx April 27, 2022 11:09
@apolinario
Copy link
Collaborator Author

Forgot register_model. Will add to the PR

@apolinario apolinario marked this pull request as draft April 27, 2022 11:40
"""
Returns the MMC associated with this loader.
"""
from cloob_training import model_pt, pretrained
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to push a separate PR to kat's repo to make cloob_training installable first, yeah? Then we can just add the cloob_training installation to the pyproject.toml

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that from Kat's repo, but do you feel this is a scalable approach?

I just wonder whether it should be a MMC requirement that every CLIP-like module we support is installable or if we should have support a custom way way to load it, maybe with cloning repos.

Copy link
Owner

@dmarx dmarx Apr 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a really good point. maybe we could even set it up in such a way that it could check to see if something we want is already "installed" on the PATH variable? feels sorta like an accident waiting happen. how about this: the logic for cloning the repo and all that has to be in a loading class like this.

here's how I want this thing ultimately to work. imagine I have never even touched a pytorch model in my life. I just heard about CLIP and I want to play with it.

  1. I install this repo. right now, I think this looks something like

    git clone https://github.com/dmarx/Multi-Modal-Comparators/;
    cd Multi-Modal-Comparators
    pip install poetry
    poetry build
    pip install dist/mmc_*.whl

  2. I import a generic api interface and blindly play with it

    import mmc
    avilable = mmc.available_models()
    perceptor = mmc.load(available[0])
    x,y = perceptor.encode_text('foo bar'), perceptor.encode_image('path/to/file.png')
    score = perceptor.compare(x,y)

my goal here is for CLOOB and KELIP to be available to the user immediately following the setup of this library.

I'm totally open to having some sort of standard pattern we can wrap for dependencies that aren't trivially installable. But I want abstracting away the setup logic and abstracting away esoteric usage api's to be the value this library offers.

Here's an idea: I've already been using a hidden .cache folder for checkpoints, let's designate a subfolder for dependencies that need to be cloned but can't be installed. there's a tool called poe that integrates with poetry as a task runner which could be useful to look at.

does that make sense? I hope I'm not trying to boil the ocean here, am I? This would definitely be more scalable but also feels unstable.

at the very least, we should add like a shell script or something the user can execute as part of the installation that does the git clone, PATH modification, etc. steps

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created this repo sort of as a joke to myself, but maybe we should turn it into a real thing? all that business above with taking care of cloning and all that into a subfolder then playing with the path variables etc: abstract all of that away into its own piece. easy way to maintain and leverage the "best practices" for ... this particular brand of "duct tape and bubblegum"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You raised very good points and I think those make sense. Although at the end we may want to go back to the original issue and see if what we are doing is more work per package than making a fork/PR for each tool we want to support we might go back to that at the end. But I don't think it is.

I checked https://github.com/nat-n/poethepoet and I feel it makes sense to me. We could script each import with git clone and path and add that to the poetry pipeline; I love the not-a-package-manager joke/idea, fully. Maybe its something to develop in parallel and eventually integrate but not sure if makes sense to make a dependency right at the start?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah works for me. let's start with scripting the setup using poe and add some sort of poe run_last_setup_step entry to the setup instructions (... whenever that becomes a thing lol. I need to add a README). We can use poe as a stop-gap and port the setup to napm whenever that's ready. I'll add you to that repo as well. the dependency management gods forgive us.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allright, god help us: napm is officially published on PyPI (pip install napm). take her for a spin and lemme know if it works ok.

@dmarx
Copy link
Owner

dmarx commented Apr 30, 2022

allright, looks like I got CLOOB working well enough. wanna take a stab at finishing off KELIP?

@dmarx
Copy link
Owner

dmarx commented May 1, 2022

...lol nevermind, kelip has a setup.py, can be installed via pyproject.toml. we good.

@dmarx dmarx marked this pull request as ready for review May 1, 2022 03:42
@dmarx dmarx merged commit 633b6af into main May 1, 2022
@dmarx dmarx mentioned this pull request May 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants