The Retrieval Augmented Generation Assistant is a chatbot that lets users ask questions about information that was not part of the training data of the underlying Large Language Model.
To do so, the user only has to enter a URL (both webpages and PDFs are supported), and the Retrieval Augmented Generation Assistant will be able to answer questions about that content.
Note: This project was developed during HackUPC 2024. See it on DevPost.
- Specify Data Source Input: The chatbot includes an input to specify the desired data source. Users can provide the URL of the webpage or PDF they are interested in, and the chatbot will retrieve relevant information from the webpage in real-time.
- Information Query Input: The chatbot offers a second input where users can ask questions or request specific information about the loaded webpage or PDF. Users can inquire about any information available in the webpage, such as its title, meta description, keywords, headings, images, links, and more.
To use the Retrieval Augmented Generation Assistant, follow these steps:
- Provide the URL of the webpage or PDF you want to analyze in the designated input field.
- Ask questions or request specific information about the webpage or PDF in the chat interface.
- Receive intelligent responses from the chatbot, which will provide relevant information extracted from the webpage or PDF.
Welcome to the Retrieval-Augmented Generation Assistant API documentation. This API is designed to provide users with functionalities for retrieving information from URLs and generating responses to user prompts or questions based on the retrieved content.
- Description: Fetches content from a given URL (HTML or PDF).
- Method: POST
- Endpoint: /get_url
- Request Parameters:
  - url (string): The URL from which to retrieve content.
- Success Responses:
- Error Responses:
- Description: Generates a response to a user question based on the content of a specified URL.
- Method: POST
- Endpoint: /ask
- Request Parameters:
  - question (string): The question posed by the user.
- Success Responses:
To use the Retrieval-Augmented Generation Assistant API, make HTTP POST requests to the appropriate endpoints with the required parameters.
- Ensure that the URLs provided are accessible and contain valid content.
- The API may return errors for invalid or inaccessible URLs.
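As a rough illustration of the two calls described above, here is a minimal client sketch that uses only the Python standard library. Several details are assumptions rather than documented behavior: the base address `http://localhost:8000`, sending the parameters as a JSON body (the API might instead expect form fields or query parameters), and the helper names `is_http_url`, `build_post`, `send`, and `load_and_ask`, which are hypothetical.

```python
import json
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# Assumption: address where the backend is reachable; replace with your own.
BASE_URL = "http://localhost:8000"


def is_http_url(url: str) -> bool:
    """Cheap client-side check that the string looks like an http(s) URL."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_post(endpoint: str, params: dict) -> Request:
    """Build a POST request carrying the parameters as a JSON body (assumed format)."""
    body = json.dumps(params).encode("utf-8")
    return Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: Request, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, which is
    how an invalid or inaccessible URL would typically surface to the caller.
    """
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def load_and_ask(source_url: str, question: str) -> str:
    """Load a data source with /get_url, then query it with /ask."""
    if not is_http_url(source_url):
        raise ValueError(f"not an http(s) URL: {source_url!r}")
    send(build_post("/get_url", {"url": source_url}))
    return send(build_post("/ask", {"question": question}))
```

With a running backend, `load_and_ask("https://example.com/paper.pdf", "What is the main result?")` would return the raw response text. The shape of that response is not specified here, so the sketch does not attempt to parse it.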
Retrieval Augmented Generation Assistant consists of two decoupled parts: a frontend interface and a backend API, as you can see in the structure of this repository.
The frontend is a simple chat interface built using Svelte.js and Vite.js. It handles calling the correct API endpoints and keeps a small amount of state so that a loading spinner is shown from the moment a question is submitted until the API response is rendered in the chat interface. It also includes a button to toggle between dark mode and light mode.
The backend is built using the Python framework FastAPI to expose the API endpoints described above. It also requires an instance of the Iris Vector Database by InterSystems.
- To scrape the webpages or PDFs, we use Beautiful Soup.
- To split the text into chunks, we use RecursiveCharacterTextSplitter by LangChain.
- To generate the embeddings, we use the OpenAI text embeddings model.
- To store the embeddings and perform similarity_search_with_relevance_scores, we use the Iris Vector Database by InterSystems.
- To interact with the user, we use the OpenAI gpt-3.5-turbo Large Language Model.
- To deploy the backend to a VPS, we used Docker.
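The flow in the list above (split, embed, search, answer) can be sketched in plain Python. This is a toy stand-in and not the project's implementation: fixed-size character windows replace RecursiveCharacterTextSplitter, word counts replace the OpenAI embeddings, an in-memory cosine-similarity ranking replaces the Iris vector search, and the prompt template is invented for illustration. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Fixed-size character windows with overlap (simplified stand-in for a recursive splitter)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k_chunks(question: str, chunks: list[str], k: int = 2) -> list[tuple[str, float]]:
    """Rank chunks by similarity to the question and keep the k best (stand-in for vector search)."""
    q = embed(question)
    scored = [(chunk, cosine_similarity(q, embed(chunk))) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the text that would be sent to the language model (illustrative template)."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The overlap between consecutive chunks is there so that a sentence cut at a window boundary still appears whole in at least one chunk. Only the top-ranked chunks are placed in the prompt, which is what lets the model answer about content it was never trained on.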