RAG app example #118
Conversation
examples/DocQA/app.py
Outdated
```python
USE_GPU = os.getenv("USE_GPU", False)
MODEL_NAME = os.getenv("MODEL_NAME", "meta-llama/Llama-3.2-1B-Instruct")
# if use_gpu, then the documents will be processed to output folder
DOCS_DIR = "/root/rag_data/output" if USE_GPU else "/root/rag_data/"
```
Why is this different depending on whether we use a GPU or not?
Our ingest pipeline will take the /root/rag_data/ data folder and save the results to /root/rag_data/output when using a GPU. Otherwise the data folder is just /root/rag_data/.
@wukaixingxp I still don't know why the output folder is different when using a GPU vs. not. If you aren't using a GPU, where does the ingest pipeline save the results?
The ingest pipeline is only for images: it takes everything in /root/rag_data, finds any images embedded in the PDFs, splits them out, and uses the 11B model to generate image descriptions. It then saves the original text together with the image descriptions into the /root/rag_data/output folder, so that everything is text data ready to be used by the 3B RAG agent. Running the 11B model on CPU is very slow, so we have not enabled this at the current stage; we believe it can be a P1 feature. For now we only support text data and ignore embedded images, so we just take everything in the /root/rag_data folder.
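For reference, here is a minimal sketch of that flow. The helper names (`extract_text_and_images`, `describe_image`) are hypothetical placeholders for the PDF parser and the 11B vision call, not functions from this PR; only the folder layout matches the code above.

```python
import os

INPUT_DIR = "/root/rag_data"
OUTPUT_DIR = "/root/rag_data/output"

def extract_text_and_images(pdf_path):
    """Hypothetical placeholder: parse one PDF into its text and embedded images."""
    raise NotImplementedError

def describe_image(image):
    """Hypothetical placeholder: caption one image with the 11B vision model."""
    raise NotImplementedError

def ingest():
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    for name in os.listdir(INPUT_DIR):
        if not name.endswith(".pdf"):
            continue
        text, images = extract_text_and_images(os.path.join(INPUT_DIR, name))
        # Replace each embedded image with a text description so the
        # 3B RAG agent only ever sees text.
        captions = [describe_image(img) for img in images]
        with open(os.path.join(OUTPUT_DIR, name + ".txt"), "w") as f:
            f.write("\n\n".join([text, *captions]))
```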
```bash
# Print a message indicating the start of llama-stack server
echo "starting the llama-stack server"
# Run llama-stack server with specified config and disable ipv6
python -m llama_stack.distribution.server.server --yaml-config /root/my-run.yaml --disable-ipv6 &
```
I think this should be re-architected a bit. Why isn't the docker running two services for this purpose:
- one for running the llama stack server
- one for running the RAG app, whose entrypoint can just be `python /root/DocQA/app.py`?
I was hoping to use LlamaStackDirectClient, but I am not sure whether LlamaStackDirectClient supports a MemoryBank connection to ChromaDB.
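For context, here is roughly the shape of the code in question, sketched against the HTTP client the app uses today. The endpoint, model name, and call parameters are assumptions about the llama-stack-client API of this era, not verified against this PR; the direct-client question is left as a comment.

```python
# Minimal sketch, not the actual app.py: today the app reaches the server
# over HTTP via llama-stack-client. The hope was to swap in an in-process
# LlamaStackDirectClient instead, *if* it can still reach a MemoryBank
# backed by ChromaDB -- that is the unresolved question in this thread.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

# Parameter names varied across llama-stack-client versions; treat this
# call shape as an assumption.
response = client.inference.chat_completion(
    model="meta-llama/Llama-3.2-1B-Instruct",
    messages=[{"role": "user", "content": "What do the docs say about X?"}],
)
print(response.completion_message.content)
```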
```bash
echo "-----starting to llama-stack docker now---------"
pip install gradio

if [ "$USE_GPU_FOR_DOC_INGESTION" = true ]; then
```
Looks like if that variable is false, we just don't ingest at all?
Yes, the ingest pipeline is only for images: it takes a PDF, finds any embedded images, and uses the 11B model to generate image descriptions; it then saves the original text with the image descriptions into the output folder, so that everything is text data ready to be used by the 3B RAG agent. Running the 11B model on CPU is very slow, so we have not enabled this at the current stage. We believe this can be a P1 feature.
Can you rebase and update?
What does this PR do?
Creating an E2E RAG example that is able to do retrieval over documents and answer user questions. Components included (a rough sketch of how they fit together follows the list):
- Inference (with llama-stack)
- Memory (with llama-stack)
- Agent (with llama-stack)
- Frontend (with Gradio)
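Here is a rough sketch of how these components fit together. This is not the actual app.py: the endpoint, model name, and client call shape are assumptions, and the memory/agent retrieval step is stubbed out.

```python
import gradio as gr
from llama_stack_client import LlamaStackClient

# Sketch only: the real app wires a llama-stack agent to a memory bank;
# here retrieval is a stub and inference goes through the plain client.
client = LlamaStackClient(base_url="http://localhost:5000")

def retrieve(question: str) -> str:
    """Stub for the llama-stack memory/agent retrieval step."""
    return ""

def answer(message, history):
    context = retrieve(message)
    # Parameter names varied across llama-stack-client versions;
    # treat this call shape as an assumption.
    response = client.inference.chat_completion(
        model="meta-llama/Llama-3.2-1B-Instruct",
        messages=[
            {"role": "system", "content": f"Answer using this context:\n{context}"},
            {"role": "user", "content": message},
        ],
    )
    return response.completion_message.content

gr.ChatInterface(answer).launch()
```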
Thanks for contributing 🎉!