-
Notifications
You must be signed in to change notification settings - Fork 1.2k
docs: Update get_started notebook and RAG page (WIP) #3595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
docs/getting_started_v0_3_0.ipynb
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we update the existing getting started guide instead? I think we're better off having one canonical notebook
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so i started exploring redoing this in the spirit of the existing quickstart, which aims to show value with basic agentic rag in as few lines of possible.
NOTE: As a brief aside, we used to be able to embed the demo script in the markdown with sphinx but it appears that is no longer true. I'm sure there's a way we could embed the script and compile it or something but that's a todo for another time.
Here's what the OpenAI Responses version of our quickstart looks like:
import io, requests
from openai import OpenAI
url="https://www.paulgraham.com/greatwork.html"
client = OpenAI()
tmp_vector_store = client.vector_stores.create(name="temp_store")
response = requests.get(url)
pseudo_file = io.BytesIO(str(response.content).encode('utf-8'))
file_id = client.files.create(file=(url, pseudo_file, "text/html"), purpose="assistants").id
client.vector_stores.files.create(vector_store_id=tmp_vector_store.id, file_id=file_id)
resp = client.responses.create(
model="gpt-4o-mini",
input="How do you do great work?",
tools=[{"type": "file_search", "vector_store_ids": [tmp_vector_store.id]}]
)
print(resp)
It's quite short and to the point and also highlights citations—I think it shows a lot of value in a little bit of code.
IMO we should highlight this quickstart with Responses but also show it with Chat Completions using the Vector Store Search API with traditional RAG (even though that's not agentic and is more code).
It would then follow, in my opinion, that the Quickstart would lead to the Detailed Tutorial, which could then lead to the RAG Deep Dive that would outline more sophisticated examples (Probably ones that expose some of the other parameters within the VectorStores.Files.create
and VectorStores.Search
APIs.
But that's just a thought on how this could flow together to be maybe fluid.
What does this PR do?
Update get_started notebook to replace old apis with new apis and similarly for the RAG page
Test Plan