[Onnx clip] Adding clip_onnx to our available models for faster inference #245
Conversation
The main piece of feedback is that onnx_clip_utils seems to be an outdated copy of clip_utils. Can you make ONNX_CLIP a subclass of CLIP? Having two copies of the same code isn't very DRY, unless you want to decouple the APIs for these two types of models.
We can already see issues with copying the code: clip_utils has already had an update with a better error message, which hasn't propagated to your code: https://github.com/marqo-ai/marqo/blob/mainline/src/marqo/s2_inference/clip_utils.py#L62
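A minimal sketch of what the suggested refactor could look like, assuming `CLIP` in `clip_utils` exposes a `load` method and an `encode_image` hook roughly like this (the method and attribute names below are illustrative, not the actual marqo API):

```python
# Hypothetical sketch: ONNX_CLIP inherits preprocessing/tokenisation from CLIP
# and overrides only model loading and the forward passes.
import onnxruntime as ort

from marqo.s2_inference.clip_utils import CLIP


class ONNX_CLIP(CLIP):
    def load(self):
        # Instead of loading the torch checkpoint, create ONNX Runtime
        # sessions for the visual and textual graphs. File names, input
        # names, and self.device are assumptions for illustration.
        providers = (["CUDAExecutionProvider"]
                     if self.device.startswith("cuda")
                     else ["CPUExecutionProvider"])
        self.visual_session = ort.InferenceSession("clip_visual.onnx", providers=providers)
        self.textual_session = ort.InferenceSession("clip_textual.onnx", providers=providers)

    def encode_image(self, images, normalize=True):
        pixel_values = self._preprocess(images)  # inherited preprocessing (hypothetical name)
        (features,) = self.visual_session.run(None, {"pixel_values": pixel_values})
        return self._normalize(features) if normalize else features
```

This way a fix like the error-message update above lands in one place and both model types pick it up.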
Yes, the end-to-end indexing and searching are tested on a 50k-image dataset.
Thanks for this PR @wanliAlex!
[Onnx clip] Adding clip_onnx to our available models for faster inference (#245) — squashed commit summary (deduplicated):
* onnx32/openai/ViT-L/14
* add a timer
* cleaning
* add test for onnx_clip
* make sure the onnx-16 model still uses float32 for textual inference for best accuracy
* merge the hf models
* update id to _device_id
* add APIs in marqo
* test new model cache key
* add todo
* add multi-gpu support
* add CPU usage and RAM usage API
* revert model cache key
* add test_eject_model test
* add unit tests
* test cuda only when cuda is available
* format updates
* test_edge_case_cpu fix (remove cuda memory test)
* add separators for readable information in model cache key
* add tests
* mainline merge
* reduce a model for testing stability
* add test for generic model
* revision
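One item in the list above, keeping float32 for textual inference while the onnx-16 image model runs in float16, can be sketched with plain ONNX Runtime (file names and graph input names below are assumptions, not the PR's actual code):

```python
import numpy as np
import onnxruntime as ort

# Assumed file and input names for illustration only.
visual_fp16 = ort.InferenceSession("clip_visual_fp16.onnx",
                                   providers=["CUDAExecutionProvider"])
textual_fp32 = ort.InferenceSession("clip_textual_fp32.onnx",
                                    providers=["CUDAExecutionProvider"])

def encode_image(pixels: np.ndarray) -> np.ndarray:
    # The image tower runs in float16 for speed; cast the result back to
    # float32 so downstream similarity maths stays in full precision.
    (feats,) = visual_fp16.run(None, {"pixel_values": pixels.astype(np.float16)})
    return feats.astype(np.float32)

def encode_text(token_ids: np.ndarray) -> np.ndarray:
    # The text tower stays in float32 for best retrieval accuracy.
    (feats,) = textual_fp32.run(None, {"input_ids": token_ids.astype(np.int64)})
    return feats.astype(np.float32)
```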
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
Feature.
What is the current behavior? (You can also link to an open issue here)
There is currently no ONNX model available for CLIP.
What is the new behavior (if this is a feature change)?
ONNX versions of the CLIP models (e.g. onnx32/openai/ViT-L/14) are now available for faster inference.
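For example, the new models can be selected by name when creating an index (a sketch using the marqo Python client; the exact settings schema may differ between client versions):

```python
import marqo

mq = marqo.Client(url="http://localhost:8882")

# "onnx32/openai/ViT-L/14" is one of the ONNX model names added in this PR;
# the onnx16 variants trade a little accuracy for faster image inference.
mq.create_index("my-image-index", model="onnx32/openai/ViT-L/14")

results = mq.index("my-image-index").search("a photo of a dog")
```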
Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
No.
Have unit tests been run against this PR? (Has there also been any additional testing?)
Not yet.
Related Python client changes (link commit/PR here)
No.
Related documentation changes (link commit/PR here)
Not yet.
Other information:
The models pass the performance test on a 50k-image dataset.
The detailed performance results can be checked at link.
Check this site for available models on Hugging Face.
Check this repo for ONNX model generation.
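For reference, the usual recipe for generating such ONNX graphs is to export each CLIP tower separately with torch.onnx.export (a sketch under that assumption, not necessarily the exact script in the linked repo):

```python
import clip  # openai/CLIP
import torch

model, preprocess = clip.load("ViT-L/14", device="cpu")
model.eval()

# Export the visual tower; the batch dimension is left dynamic.
# The textual tower needs a small nn.Module wrapper around
# model.encode_text before it can be exported the same way.
dummy_image = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model.visual,
    dummy_image,
    "clip_visual.onnx",
    input_names=["pixel_values"],
    output_names=["image_features"],
    dynamic_axes={"pixel_values": {0: "batch"}},
    opset_version=14,
)
```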