-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CLOOB and KELIP #14
Conversation
Forgot |
src/mmc/loaders/cloobloader.py
Outdated
""" | ||
Returns the MMC associated with this loader. | ||
""" | ||
from cloob_training import model_pt, pretrained |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to push a separate PR to kat's repo to make cloob_training
installable first, yeah? Then we can just add the cloob_training
installation to the pyproject.toml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do that from Kat's repo, but do you feel this is a scalable approach?
I just wonder whether it should be a MMC requirement that every CLIP-like module we support is installable or if we should have support a custom way way to load it, maybe with cloning repos.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a really good point. maybe we could even set it up in such a way that it could check to see if something we want is already "installed" on the PATH variable? feels sorta like an accident waiting happen. how about this: the logic for cloning the repo and all that has to be in a loading class like this.
here's how I want this thing ultimately to work. imagine I have never even touched a pytorch model in my life. I just heard about CLIP and I want to play with it.
-
I install this repo. right now, I think this looks something like
git clone https://github.com/dmarx/Multi-Modal-Comparators/;
cd Multi-Modal-Comparators
pip install poetry
poetry build
pip install dist/mmc_*.whl -
I import a generic api interface and blindly play with it
import mmc
avilable = mmc.available_models()
perceptor = mmc.load(available[0])
x,y = perceptor.encode_text('foo bar'), perceptor.encode_image('path/to/file.png')
score = perceptor.compare(x,y)
my goal here is for CLOOB and KELIP to be available to the user immediately following the setup of this library.
I'm totally open to having some sort of standard pattern we can wrap for dependencies that aren't trivially installable. But I want abstracting away the setup logic and abstracting away esoteric usage api's to be the value this library offers.
Here's an idea: I've already been using a hidden .cache folder for checkpoints, let's designate a subfolder for dependencies that need to be cloned but can't be installed. there's a tool called poe that integrates with poetry as a task runner which could be useful to look at.
does that make sense? I hope I'm not trying to boil the ocean here, am I? This would definitely be more scalable but also feels unstable.
at the very least, we should add like a shell script or something the user can execute as part of the installation that does the git clone, PATH modification, etc. steps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created this repo sort of as a joke to myself, but maybe we should turn it into a real thing? all that business above with taking care of cloning and all that into a subfolder then playing with the path variables etc: abstract all of that away into its own piece. easy way to maintain and leverage the "best practices" for ... this particular brand of "duct tape and bubblegum"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You raised very good points and I think those make sense. Although at the end we may want to go back to the original issue and see if what we are doing is more work per package than making a fork/PR for each tool we want to support we might go back to that at the end. But I don't think it is.
I checked https://github.com/nat-n/poethepoet and I feel it makes sense to me. We could script each import with git clone and path and add that to the poetry pipeline; I love the not-a-package-manager joke/idea, fully. Maybe its something to develop in parallel and eventually integrate but not sure if makes sense to make a dependency right at the start?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah works for me. let's start with scripting the setup using poe and add some sort of poe run_last_setup_step
entry to the setup instructions (... whenever that becomes a thing lol. I need to add a README). We can use poe as a stop-gap and port the setup to napm whenever that's ready. I'll add you to that repo as well. the dependency management gods forgive us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allright, god help us: napm is officially published on PyPI (pip install napm
). take her for a spin and lemme know if it works ok.
allright, looks like I got CLOOB working well enough. wanna take a stab at finishing off KELIP? |
...lol nevermind, kelip has a setup.py, can be installed via pyproject.toml. we good. |
Adding CLOOB and KELIP loaders