
[Onnx clip] Adding the clip_onnx to our available models for faster inference #245

Merged

pandu-k merged 25 commits into mainline from onnx-clip on Dec 29, 2022

Conversation

wanliAlex
Collaborator

@wanliAlex wanliAlex commented Dec 20, 2022

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
    Feature

  • What is the current behavior? (You can also link to an open issue here)
    There is no ONNX model for CLIP.

  • What is the new behavior (if this is a feature change)?
    We now have ONNX models for CLIP.

  • Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
    no

  • Have unit tests been run against this PR? (Has there also been any additional testing?)
    not yet

  • Related Python client changes (link commit/PR here)
    no

  • Related documentation changes (link commit/PR here)
    not yet

  • Other information:
    The model passes the performance test on a 50k-image dataset.
    The detailed performance results can be checked at link.
    Check this site for the available models on Hugging Face.
    Check this repo for ONNX model generation.

  • Please check if the PR fulfills these requirements

  • The commit message follows our guidelines
  • Tests for the changes have been added (for bug fixes/features)
  • Docs have been added / updated (for bug fixes / features)

Collaborator

@pandu-k pandu-k left a comment


The main piece of feedback is that onnx_clip_utils seems to be an outdated copy of clip_utils. Can you make ONNX_CLIP a subclass of CLIP? Having two copies of the same code isn't very DRY, unless you want to decouple the APIs for these two types of models.

We can already see issues with copying the code: clip_utils has already received an update with a better error message, which hasn't propagated to your code: https://github.com/marqo-ai/marqo/blob/mainline/src/marqo/s2_inference/clip_utils.py#L62
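A minimal sketch of the subclassing suggested here (all class and method names are illustrative, not marqo's actual API): shared validation and error handling live once in the base class, so a fix like the linked error-message update propagates to the ONNX backend automatically.

```python
class CLIP:
    """Base class: holds the logic both backends share."""

    def __init__(self, model_name: str, device: str = "cpu"):
        self.model_name = model_name
        self.device = device

    def _validate(self, texts):
        # Shared error handling lives in ONE place, so an improved error
        # message here reaches every subclass without copy-pasting.
        if not texts:
            raise ValueError(f"empty input for model {self.model_name}")
        return [str(t) for t in texts]

    def encode_text(self, texts):
        texts = self._validate(texts)
        return [f"torch:{t}" for t in texts]  # stand-in for PyTorch inference


class ONNX_CLIP(CLIP):
    """Overrides only the inference backend; validation is inherited."""

    def encode_text(self, texts):
        texts = self._validate(texts)  # reused, not copied
        return [f"onnx:{t}" for t in texts]  # stand-in for an ORT session run
```

With this shape, decoupling the APIs later is still possible by overriding more methods, but by default there is a single source of truth for the shared code.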

src/marqo/s2_inference/onnx_clip_utils.py Outdated Show resolved Hide resolved
@wanliAlex wanliAlex changed the title [Onnx clip draft] Adding the clip_onnx to our available models for faster inference [Onnx clip] Adding the clip_onnx to our available models for faster inference Dec 22, 2022
@wanliAlex
Collaborator Author

Yes, end-to-end indexing and searching were tested on a 50k-image dataset.

Collaborator

@pandu-k pandu-k left a comment


Thanks for this PR @wanliAlex !

@pandu-k pandu-k temporarily deployed to marqo-test-suite December 29, 2022 02:39 — with GitHub Actions Inactive
@pandu-k pandu-k temporarily deployed to marqo-test-suite December 29, 2022 04:35 — with GitHub Actions Inactive
@pandu-k pandu-k temporarily deployed to marqo-test-suite December 29, 2022 06:11 — with GitHub Actions Inactive
@pandu-k pandu-k merged commit 8965011 into mainline Dec 29, 2022
@pandu-k pandu-k deleted the onnx-clip branch December 29, 2022 06:50
wanliAlex added a commit that referenced this pull request Dec 29, 2022
…rence (#245)

* onnx32/openai/ViT-L/14

* add a timer

* cleaning

* add test for onnx_clip

* make sure onnx-16 model still use float32 for textual inference for best accuracy.

* we merge the hf models.

* update id to _device_id
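The "onnx-16 model still use float32 for textual inference" commits above reflect a common mixed-precision pattern: run the image encoder in float16 for speed and memory savings, while keeping the text path in float32, where half-precision rounding can hurt retrieval accuracy. A hedged NumPy sketch of the idea (the encoder bodies are stand-ins, not marqo's actual ONNX sessions):

```python
import numpy as np

def encode_image_fp16(pixels: np.ndarray) -> np.ndarray:
    x = pixels.astype(np.float16)            # half precision: faster, smaller
    emb = x.mean(axis=(1, 2))                # stand-in for the ONNX session run
    return emb.astype(np.float32)            # upcast so downstream math is fp32

def encode_text_fp32(token_embeddings: np.ndarray) -> np.ndarray:
    x = token_embeddings.astype(np.float32)  # text path stays in full precision
    return x.mean(axis=1)                    # stand-in for the ONNX session run
```

Both encoders emit float32 embeddings, so the similarity computation between image and text vectors happens at full precision regardless of which backend produced them.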
pandu-k pushed a commit that referenced this pull request Jan 6, 2023
* add APIs in marqo

* test new model cache key

* cleaning

* Add todo

* add multi-gpu support

* space adding

* adding cpu usage, RAM usage api

* revert back model cache key

* add test_eject_model test

* add unit test

* test cuda only when cuda is available

* format update

* test_edge_case_cpu fix (remove cuda memory test)

* add separators for readable information in model cache key

* adding test

* update id to _device_id

* [Onnx clip] Adding the clip_onnx to our available models for faster inference (#245)

* mainline merge

* reduce a model for testing stability

* update

* add test for generic model

* revision
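The "add separators for readable information in model cache key" commits above hint at a simple pattern: join the key's fields with a visible delimiter so the key is human-readable in logs and can be split back into its fields. A sketch under assumed field names and an assumed separator (not marqo's actual cache-key format):

```python
def model_cache_key(model_name: str, properties_hash: str, device: str) -> str:
    # "||" is an assumed separator; it rarely appears in model names or
    # device strings, so the key stays both readable and splittable.
    return "||".join([model_name, properties_hash, device])

key = model_cache_key("onnx32/openai/ViT-L/14", "abc123", "cuda")
name, props, device = key.split("||")
```

Compared with concatenating fields directly, a separator removes ambiguity (e.g. distinguishing the model name from the device suffix) and makes ejecting a specific model/device pair from the cache straightforward.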