Multimodal dataprep #575

Merged: 27 commits, Aug 29, 2024

tileintel
Contributor

Description

This PR introduces a multimodal dataprep microservice, which is required for the Multimodal RAG on Videos application. It allows users to upload mp4 videos and, optionally, their associated transcripts, and ingests them into a Redis vector store.

This microservice provides three APIs that let users upload and ingest videos, one for each of three use cases:

  1. A transcript file (in .vtt format) is available for each video.
  2. A video has meaningful audio or recognizable speech, but its transcript file is not available. In this case, the microservice uses a Whisper model to generate the .vtt transcript for the video.
  3. A video has no audio, or no meaningful audio. In this case, a transcript either does not exist or provides no meaningful information, so it is preferable to use an LVM microservice to summarize the video frames.
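
The three use cases amount to a simple decision based on what accompanies each video. A minimal sketch of that decision follows. The names `VideoUpload` and `choose_ingest_path`, and the three returned strings, are hypothetical and chosen for illustration; they are not the microservice's actual identifiers or routes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoUpload:
    """Hypothetical description of one uploaded video (not the service's schema)."""

    video_path: str
    transcript_path: Optional[str] = None  # path to a .vtt file, if one exists
    has_meaningful_audio: bool = False     # caller's judgment about the audio track


def choose_ingest_path(upload: VideoUpload) -> str:
    """Return which of the three ingestion use cases applies to this upload."""
    if upload.transcript_path is not None:
        # Use case 1: a transcript is supplied alongside the video.
        return "use_existing_transcript"
    if upload.has_meaningful_audio:
        # Use case 2: no transcript, but speech can be transcribed.
        return "generate_transcript_with_whisper"
    # Use case 3: no usable audio, so the frames themselves are summarized.
    return "summarize_frames_with_lvm"
```

For example, `choose_ingest_path(VideoUpload("clip.mp4"))` falls through to the frame-summarization case because neither a transcript nor meaningful audio is declared.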

The microservice also provides an API to list all videos ingested under the current index name, and an API to delete all videos under the current index name from both local storage and the Redis vector store.

Issues

RFC: https://github.com/opea-project/docs/pull/49/files
Issue: opea-project/GenAIExamples#358

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

docarray[full]
fastapi
langchain==0.1.12
langchain_benchmarks
langsmith
moviepy
opencv-python
openai-whisper
opentelemetry-api
opentelemetry-exporter-otlp
opentelemetry-sdk
Pillow
prometheus-fastapi-instrumentator
pydantic==2.8.2
python-multipart
redis
transformers
shortuuid
uvicorn
webvtt-py
llava-hf/llava-1.5-7b-hf

Tests

We have provided one test for this microservice.

  • tests/test_dataprep_redis_multimodal_langchain.sh: This test downloads a video from here and prepares a transcript for the downloaded video. It then uses both to exercise the three ingestion APIs, and also tests the get_videos and delete_videos APIs.
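
The transcript that accompanies a video is a WebVTT (.vtt) file. As a rough illustration of what such a file contains, the sketch below renders a list of timed text segments as WebVTT. The helper names are hypothetical, and the segment shape (dicts with `start`, `end`, and `text` keys, with times in seconds) is an assumption made for this example; it is not taken from the PR's code.

```python
from typing import Dict, List


def format_vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_vtt(segments: List[Dict]) -> str:
    """Render segments (assumed keys: start, end, text) as WebVTT text."""
    lines = ["WEBVTT", ""]  # required header, then a blank line
    for seg in segments:
        start = format_vtt_timestamp(seg["start"])
        end = format_vtt_timestamp(seg["end"])
        lines.append(f"{start} --> {end}")  # cue timing line
        lines.append(seg["text"].strip())   # cue payload
        lines.append("")                    # blank line ends the cue
    return "\n".join(lines)
```

Writing the returned string to a file with a `.vtt` extension gives a transcript with one cue per segment.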

tileintel and others added 17 commits August 22, 2024 22:18
Signed-off-by: Tiep Le <tiep.le@intel.com>

codecov bot commented Aug 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

| Files with missing lines | Coverage Δ | |
|---|---|---|
| comps/cores/proto/docarray.py | 99.09% <100.00%> (+0.08%) | ⬆️ |

... and 1 file with indirect coverage changes

@tileintel
Contributor Author

@lvliang-intel, here is another PR, this one for multimodal data prep with Redis for Multimodal RAG. It includes the changes from PR #555 because #555 has not yet been reviewed and merged into the GenAIComps main branch.
Would you please help review this PR?
cc: @kevinintel

@kevinintel kevinintel added this to the v1.0 milestone Aug 29, 2024
Signed-off-by: Tiep Le <tiep.le@intel.com>
@XuhuiRen
Collaborator

It would be better to maintain a single PR. Would you consider merging #555 into this one?
Also, the suggestions in that PR should be moved into this PR.

@tileintel
Contributor Author

It would be better to maintain a single PR. Would you consider merging #555 into this one? Also, the suggestions in that PR should be moved into this PR.

Thank you, @XuhuiRen, for your suggestion. I have merged PR #555 into this one. I cannot move the suggestions from PR #555 here; however, I have addressed all of the conversations from #555 in this PR.
I apologize for any inconvenience. I originally thought that separating the microservices into different PRs would help reviewers, but it seems it does not.

@XuhuiRen
Collaborator

@BaoHuiling @tileintel i did not see the retriever for multimodal. there is only a reranker PR. Is there any PR I missed?

@XuhuiRen
Collaborator

It would be better to maintain a single PR. Would you consider merging #555 into this one? Also, the suggestions in that PR should be moved into this PR.

Thank you, @XuhuiRen, for your suggestion. I have merged PR #555 into this one. I cannot move the suggestions from PR #555 here; however, I have addressed all of the conversations from #555 in this PR. I apologize for any inconvenience. I originally thought that separating the microservices into different PRs would help reviewers, but it seems it does not.

I recommend adding the code for image embedding to keep the service comprehensive.

@tileintel
Contributor Author

@BaoHuiling @tileintel i did not see the retriever for multimodal. there is only a reranker PR. Is there any PR I missed?

You are correct. We haven't submitted the retriever microservice yet. We are waiting for #555 and #575 to be merged before we submit the last PRs, for retrieval and LVM.
Also, please note that we have several GenAIExamples as well. For Multimodal RAG on videos, we will not use the rerank microservice.

@tileintel tileintel mentioned this pull request Aug 29, 2024
3 tasks
@BaoHuiling
Collaborator

@BaoHuiling @tileintel i did not see the retriever for multimodal. there is only a reranker PR. Is there any PR I missed?

hi Xuhui. To clarify this, we are contributing different use case for MMRAG, and I’m working on VideoRAGQnA, which is PR #495, #496, #538, #539 and we are going to contribute another PR for dataprep

@XuhuiRen
Collaborator

@BaoHuiling @tileintel i did not see the retriever for multimodal. there is only a reranker PR. Is there any PR I missed?

hi Xuhui. To clarify this, we are contributing different use case for MMRAG, and I’m working on VideoRAGQnA, which is PR #495, #496, #538, #539 and we are going to contribute another PR for dataprep

but it seems that PR #538 has name conflict with this PR?

@tileintel
Contributor Author

tileintel commented Aug 29, 2024

@BaoHuiling @tileintel i did not see the retriever for multimodal. there is only a reranker PR. Is there any PR I missed?

hi Xuhui. To clarify this, we are contributing different use case for MMRAG, and I’m working on VideoRAGQnA, which is PR #495, #496, #538, #539 and we are going to contribute another PR for dataprep

but it seems that PR #538 has name conflict with this PR?

Yes. #538 was developed from a previous commit of this PR. I would suggest that we review and merge #575 first, and @BaoHuiling and I will resolve the conflict from #538 right after. I have mentioned this in the related Issue-538

@BaoHuiling
Collaborator

@BaoHuiling @tileintel i did not see the retriever for multimodal. there is only a reranker PR. Is there any PR I missed?

hi Xuhui. To clarify this, we are contributing different use case for MMRAG, and I’m working on VideoRAGQnA, which is PR #495, #496, #538, #539 and we are going to contribute another PR for dataprep

but it seems that PR #538 has name conflict with this PR?

Yes, there should be some conflicts. We will align on it tomorrow; please hold those PRs until we resolve the conflicts. Thanks!

@BaoHuiling
Collaborator

@srinarayan-srikanthan @s-gobriel please take a look on this PR, check the conflicts and we need to update the code

@BaoHuiling
Collaborator

@BaoHuiling @tileintel i did not see the retriever for multimodal. there is only a reranker PR. Is there any PR I missed?

hi Xuhui. To clarify this, we are contributing different use case for MMRAG, and I’m working on VideoRAGQnA, which is PR #495, #496, #538, #539 and we are going to contribute another PR for dataprep

but it seems that PR #538 has name conflict with this PR?

Yes. #538 was developed from a previous commit of this PR. I would suggest that we review and merge #575 first, and @BaoHuiling and I will resolve the conflict from #538 right after. I have mentioned this in the related Issue-538

I'm okay with this. Please do it

@tileintel
Contributor Author

@XuhuiRen Given that @XuhuiRen and @lvliang-intel have already approved this PR, would you please help merge it? This will help us resolve conflicts in another related PR more quickly. Thanks.

@lvliang-intel lvliang-intel merged commit 6d4b668 into opea-project:main Aug 29, 2024
16 checks passed
a32543254 pushed a commit to a32543254/GenAIComps that referenced this pull request Sep 3, 2024
* multimodal embedding for MM RAG for videos

Signed-off-by: Tiep Le <tiep.le@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* develop data prep first commit

Signed-off-by: Tiep Le <tiep.le@intel.com>

* develop dataprep microservice for multimodal data

Signed-off-by: Tiep Le <tiep.le@intel.com>

* multimodal langchain for dataprep

Signed-off-by: Tiep Le <tiep.le@intel.com>

* update README

Signed-off-by: Tiep Le <tiep.le@intel.com>

* update README

Signed-off-by: Tiep Le <tiep.le@intel.com>

* update README

Signed-off-by: Tiep Le <tiep.le@intel.com>

* update README

Signed-off-by: Tiep Le <tiep.le@intel.com>

* cosmetic

Signed-off-by: Tiep Le <tiep.le@intel.com>

* test for multimodal dataprep

Signed-off-by: Tiep Le <tiep.le@intel.com>

* update test

Signed-off-by: Tiep Le <tiep.le@intel.com>

* update test

Signed-off-by: Tiep Le <tiep.le@intel.com>

* update test

Signed-off-by: Tiep Le <tiep.le@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cosmetic update

Signed-off-by: Tiep Le <tiep.le@intel.com>

* remove langsmith

Signed-off-by: Tiep Le <tiep.le@intel.com>

* update API to remove /dataprep from API names and remove langsmith

Signed-off-by: Tiep Le <tiep.le@intel.com>

* update test

Signed-off-by: Tiep Le <tiep.le@intel.com>

* update the error message per PR reviewer

Signed-off-by: Tiep Le <tiep.le@intel.com>

---------

Signed-off-by: Tiep Le <tiep.le@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>
sharanshirodkar7 pushed a commit to predictionguard/pg-GenAIComps that referenced this pull request Sep 3, 2024
lvliang-intel added a commit that referenced this pull request Sep 10, 2024
* add rerank with neural speed

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* add the code

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* add the code

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* fix mismatched response format w/wo streaming guardrails (#568)

* fix mismatched response format w/wo streaming  guardrails

* fix & debug

* fix & rm debug

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* Fix guardrails out handle logics for space linebreak and quote (#571)

* fix mismatched response format w/wo streaming  guardrails

* fix & debug

* fix & rm debug

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* debug

* debug

* debug

* fix pre-space and linebreak

* fix pre-space and linebreak

* fix single/double quote

* fix single/double quote

* remove debug

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* BUG FIX: LVM security fix (#572)

* add url validator

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add validation for video_url

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

---------

Signed-off-by: BaoHuiling <huiling.bao@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* Modify output messages. (#569)

* Reduced output.

Signed-off-by: zepan <ze.pan@intel.com>

* Output the location where the modified Dockerfile file is referenced.

Signed-off-by: zepan <ze.pan@intel.com>

* for test

Signed-off-by: zepan <ze.pan@intel.com>

* Restore test file.

Signed-off-by: zepan <ze.pan@intel.com>

---------

Signed-off-by: zepan <ze.pan@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* refine logging code. (#559)

* add ut and refine logging code.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update microservice port.

---------

Co-authored-by: root <root@idc708073.jf.intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* adding lancedb to langchain vectorstores (#291)

* adding lancedb to langchain vectorstores

Signed-off-by: sharanshirodkar7 <ssharanshirodkar7@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: sharanshirodkar7 <ssharanshirodkar7@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: lvliang-intel <liang1.lv@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* Refine Dataprep Milvus MS (#570)

Signed-off-by: letonghan <letong.han@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* final version

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* update the readme

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* add the sign

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* fix error for pre ci

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* add the ut

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* update docker file

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* update CI test log achieve (#577)

Signed-off-by: chensuyue <suyue.chen@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* Multimodal dataprep (#575)

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* add: Pathway vector store and retriever as LangChain component (#342)

* nb

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* init changes

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* docker

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* example data

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* docs(readme): update, add commands

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* fix: formatting, data sources

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* docs(readme): update instructions, add comments

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* fix: rm unused parts

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* fix: image name, compose env vars

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* fix: rm unused part

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* fix: logging name

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* fix: env var

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* fix: rename pw docker

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* docs(readme): update input sources

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* nb

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* init changes

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* fix: formatting, data sources

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* docs(readme): update instructions, add comments

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* fix: rm unused part

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* fix: rename pw docker

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* feat: mv vector store, naming, clarify instructions, improve ingestion components

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tests: add pw retriever test
fix: update docker to include libmagic

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* implement suggestions from review, entrypoint, reqs, comments, https_proxy.

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix: update docker tags in test and readme

Signed-off-by: Berke <berkecanrizai1@gmail.com>

* tests: add separate pathway vectorstore test

Signed-off-by: Berke <berkecanrizai1@gmail.com>

---------

Signed-off-by: Berke <berkecanrizai1@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sihan Chen <39623753+Spycsh@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* Add local Rerank microservice for VideoRAGQnA (#496)

* initial commit

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* save

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* add readme, test script, fix bug

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* update video URL

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* use default

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update core dependency

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* use p 5000

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* use 5037

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* update ctnr name

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* remove langsmith

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add rerank algo desc in readme

Signed-off-by: BaoHuiling <huiling.bao@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: BaoHuiling <huiling.bao@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* Add Scan Container. (#560)

Signed-off-by: zepan <ze.pan@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* fix SearchedMultimodalDoc in docarray (#583)

Signed-off-by: BaoHuiling <huiling.bao@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* update image build yaml (#529)

Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: zepan <ze.pan@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* add microservice for intent detection (#131)

* add microservice for intent detection

Signed-off-by: Liangyx2 <yuxiang.liang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update license copyright

Signed-off-by: Liangyx2 <yuxiang.liang@intel.com>

* add ut

Signed-off-by: Liangyx2 <yuxiang.liang@intel.com>

* refine

Signed-off-by: Liangyx2 <yuxiang.liang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update folder

Signed-off-by: Liangyx2 <yuxiang.liang@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix test

Signed-off-by: Liangyx2 <yuxiang.liang@intel.com>

---------

Signed-off-by: Liangyx2 <yuxiang.liang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* Make the scanning method optional. (#580)

Signed-off-by: zepan <ze.pan@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* add code owners (#586)

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* remove revision for tei (#584)

Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* Bug fix (#591)

* Check if the document exists.

Signed-off-by: zepan <ze.pan@intel.com>

* Add flag output.

Signed-off-by: zepan <ze.pan@intel.com>

* Modify nginx readme.

Signed-off-by: zepan <ze.pan@intel.com>

* Modify document detection logic

Signed-off-by: zepan <ze.pan@intel.com>

---------

Signed-off-by: zepan <ze.pan@intel.com>
Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* fix ut issue

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* merge the main

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* align with new pipeline

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* align with newest pipeline

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* upload code

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* update the ut

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* add docker path

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

* add the docker path

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>

---------

Signed-off-by: Dong, Bo1 <bo1.dong@intel.com>
Signed-off-by: BaoHuiling <huiling.bao@intel.com>
Signed-off-by: zepan <ze.pan@intel.com>
Signed-off-by: sharanshirodkar7 <ssharanshirodkar7@gmail.com>
Signed-off-by: letonghan <letong.han@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Tiep Le <tiep.le@intel.com>
Signed-off-by: Berke <berkecanrizai1@gmail.com>
Signed-off-by: Liangyx2 <yuxiang.liang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sihan Chen <39623753+Spycsh@users.noreply.github.com>
Co-authored-by: Huiling Bao <huiling.bao@intel.com>
Co-authored-by: ZePan110 <ze.pan@intel.com>
Co-authored-by: lkk <33276950+lkk12014402@users.noreply.github.com>
Co-authored-by: root <root@idc708073.jf.intel.com>
Co-authored-by: Sharan Shirodkar <91109427+sharanshirodkar7@users.noreply.github.com>
Co-authored-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Letong Han <106566639+letonghan@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com>
Co-authored-by: berkecanrizai <63911408+berkecanrizai@users.noreply.github.com>
Co-authored-by: Liangyx2 <yuxiang.liang@intel.com>
Co-authored-by: kevinintel <hanwen.chang@intel.com>
5 participants