[Onnx clip] Adding clip_onnx to our available models for faster inference #245
Conversation
The main piece of feedback is that onnx_clip_utils seems to be an outdated copy of clip_utils. Can you make ONNX_CLIP a subclass of CLIP? Having two copies of the same code isn't very DRY, unless you want to decouple the APIs for these two types of models.
We can already see issues with copying the code: clip_utils has already had an update with a better error message, which hasn't propagated to your code: https://github.com/marqo-ai/marqo/blob/mainline/src/marqo/s2_inference/clip_utils.py#L62
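A minimal sketch of what the suggested refactor could look like, assuming `CLIP` in `clip_utils` exposes a `load` method and an `encode_image` hook roughly like this (the method and attribute names below are illustrative, not the actual marqo API):

```python
# Hypothetical sketch: ONNX_CLIP inherits preprocessing/tokenisation from CLIP
# and overrides only model loading and the forward passes.
import onnxruntime as ort

from marqo.s2_inference.clip_utils import CLIP


class ONNX_CLIP(CLIP):
    def load(self):
        # Instead of loading the torch checkpoint, create ONNX Runtime
        # sessions for the visual and textual graphs. File names, input
        # names, and self.device are assumptions for illustration.
        providers = (["CUDAExecutionProvider"]
                     if self.device.startswith("cuda")
                     else ["CPUExecutionProvider"])
        self.visual_session = ort.InferenceSession("clip_visual.onnx", providers=providers)
        self.textual_session = ort.InferenceSession("clip_textual.onnx", providers=providers)

    def encode_image(self, images, normalize=True):
        pixel_values = self._preprocess(images)  # inherited preprocessing (hypothetical name)
        (features,) = self.visual_session.run(None, {"pixel_values": pixel_values})
        return self._normalize(features) if normalize else features
```

This way a fix like the error-message update above lands in one place and both model types pick it up.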
Yes, the end-to-end indexing and searching are tested on a 50k-image dataset.
Thanks for this PR @wanliAlex!
[Onnx clip] Adding clip_onnx to our available models for faster inference (#245) — squashed commit summary (deduplicated):
* onnx32/openai/ViT-L/14
* add a timer
* cleaning
* add test for onnx_clip
* make sure the onnx-16 model still uses float32 for textual inference for best accuracy
* merge the hf models
* update id to _device_id
* add APIs in marqo
* test new model cache key
* add todo
* add multi-gpu support
* add CPU usage and RAM usage API
* revert model cache key
* add test_eject_model test
* add unit tests
* test cuda only when cuda is available
* format updates
* test_edge_case_cpu fix (remove cuda memory test)
* add separators for readable information in model cache key
* add tests
* mainline merge
* reduce a model for testing stability
* add test for generic model
* revision
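One item in the list above, keeping float32 for textual inference while the onnx-16 image model runs in float16, can be sketched with plain ONNX Runtime (file names and graph input names below are assumptions, not the PR's actual code):

```python
import numpy as np
import onnxruntime as ort

# Assumed file and input names for illustration only.
visual_fp16 = ort.InferenceSession("clip_visual_fp16.onnx",
                                   providers=["CUDAExecutionProvider"])
textual_fp32 = ort.InferenceSession("clip_textual_fp32.onnx",
                                    providers=["CUDAExecutionProvider"])

def encode_image(pixels: np.ndarray) -> np.ndarray:
    # The image tower runs in float16 for speed; cast the result back to
    # float32 so downstream similarity maths stays in full precision.
    (feats,) = visual_fp16.run(None, {"pixel_values": pixels.astype(np.float16)})
    return feats.astype(np.float32)

def encode_text(token_ids: np.ndarray) -> np.ndarray:
    # The text tower stays in float32 for best retrieval accuracy.
    (feats,) = textual_fp32.run(None, {"input_ids": token_ids.astype(np.int64)})
    return feats.astype(np.float32)
```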
What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
Feature.
What is the current behavior? (You can also link to an open issue here)
There is currently no ONNX model available for CLIP.
What is the new behavior (if this is a feature change)?
ONNX versions of the CLIP models (e.g. onnx32/openai/ViT-L/14) are now available for faster inference.
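For example, the new models can be selected by name when creating an index (a sketch using the marqo Python client; the exact settings schema may differ between client versions):

```python
import marqo

mq = marqo.Client(url="http://localhost:8882")

# "onnx32/openai/ViT-L/14" is one of the ONNX model names added in this PR;
# the onnx16 variants trade a little accuracy for faster image inference.
mq.create_index("my-image-index", model="onnx32/openai/ViT-L/14")

results = mq.index("my-image-index").search("a photo of a dog")
```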
Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
No.
Have unit tests been run against this PR? (Has there also been any additional testing?)
Not yet.
Related Python client changes (link commit/PR here)
No.
Related documentation changes (link commit/PR here)
Not yet.
Other information:
The models pass the performance test on a 50k-image dataset.
The detailed performance results can be checked at link.
Check this site for available models on Hugging Face.
Check this repo for ONNX model generation.
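For reference, the usual recipe for generating such ONNX graphs is to export each CLIP tower separately with torch.onnx.export (a sketch under that assumption, not necessarily the exact script in the linked repo):

```python
import clip  # openai/CLIP
import torch

model, preprocess = clip.load("ViT-L/14", device="cpu")
model.eval()

# Export the visual tower; the batch dimension is left dynamic.
# The textual tower needs a small nn.Module wrapper around
# model.encode_text before it can be exported the same way.
dummy_image = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model.visual,
    dummy_image,
    "clip_visual.onnx",
    input_names=["pixel_values"],
    output_names=["image_features"],
    dynamic_axes={"pixel_values": {0: "batch"}},
    opset_version=14,
)
```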