Merge remote-tracking branch 'upstream/main'

sirocco-ventures · Oct 8, 2024 · 594ab29
2 parents 2c107f2 + 41f3db5
Showing 64 changed files with 17,460 additions and 163 deletions.
diff --git a/.dockerignore b/.dockerignore
@@ -0,0 +1,14 @@
+./ui
+./.github
+.gitignore
+poetry.lock
+README.md
+CODE_OF_CONDUCT.md
+pyproject.toml
+CONTRIBUTING.md
+.flake8
+LICENSE
+setup.py
+Makefile
+.pre-commit-config.yaml
+.env.example
diff --git a/.github/workflows/static.yml b/.github/workflows/static.yml
@@ -0,0 +1,59 @@
+# Simple workflow for deploying static content to GitHub Pages
+name: Deploy static content to Pages
+
+on:
+  # Runs on pushes targeting the default branch
+  push:
+    branches: ["main"]
+
+  # Allows you to run this workflow manually from the Actions tab
+  workflow_dispatch:
+
+# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+# Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
+# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
+concurrency:
+  group: "pages"
+  cancel-in-progress: false
+
+jobs:
+  build_and_deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          sparse-checkout: 'documents'
+          sparse-checkout-cone-mode: false
+      - name: Setup Pages
+        uses: actions/configure-pages@v5
+      # Set up Node.js and install dependencies with npm
+      - name: Set up Node.js
+        uses: actions/setup-node@v3
+        with:
+          node-version: 18.0.x
+
+      - name: Install docusaurus
+        run: npm install
+        working-directory: documents
+      - name: Build documents
+        run: npm run build
+        working-directory: documents
+
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: documents/build
+
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
diff --git a/Dockerfile b/Dockerfile
@@ -0,0 +1,33 @@
+# Stage 1: Builder
+FROM python:3.11 AS builder
+
+# Prevent generation of .pyc files and disable output buffering so logs appear immediately
+ENV PYTHONDONTWRITEBYTECODE=1
+ENV PYTHONUNBUFFERED=1
+
+# Set the working directory in the container
+WORKDIR /app
+
+# Copy the requirements file into the container
+COPY requirements.txt .
+
+# Create and activate a virtual environment, then install the dependencies
+RUN pip install virtualenv && \
+    virtualenv /opt/venv && \
+    . /opt/venv/bin/activate && \
+    pip install --no-cache-dir -r requirements.txt
+
+# Stage 2: Deployer
+FROM python:3.11-slim AS deployer
+
+# Copy the virtual environment from the builder stage
+COPY --from=builder /opt/venv /opt/venv
+ENV PATH="/opt/venv/bin:$PATH"
+
+# Set the working directory
+WORKDIR /app
+
+# Copy the rest of the application code
+COPY . .
+
+EXPOSE 8001
diff --git a/README.md b/README.md
@@ -19,7 +19,8 @@ The project is in its early stages, and we are working on adding more capabiliti
 • Current focus: We are currently focused on making it easy to build RAG applications. Going forward, we will focus on maintenance and monitoring of the RAG system, as well as on helping these applications move from pilot to production.
 
 ### RAGGENIE Demo
-[![Demo Video](https://img.youtube.com/vi/8h4bqqs5S3U/0.jpg)](https://www.youtube.com/watch?v=8h4bqqs5S3U)
+1. Demo with database -    [![Demo with database](https://img.youtube.com/vi/7wBO6g4rj3U/0.jpg)](https://www.youtube.com/watch?v=7wBO6g4rj3U)
+2. Demo with website data -    [![Demo with website data](https://img.youtube.com/vi/8h4bqqs5S3U/0.jpg)](https://www.youtube.com/watch?v=8h4bqqs5S3U)
 
 ## 🌎 Communities
 
@@ -74,6 +75,9 @@ This component will help you embed the chat widget into your UI with JavaScript.
 ## 🛠️ Getting Started
 You can use RAGGENIE to create your own conversational chat feature for your application, either by integrating it as a chatbot or by embedding it into your application. You can also use it to create different chatbots for different internal teams by tuning each chatbot for different tasks and using a different knowledge base for each use case.
 
+### How to run Video
+[![Setting up RAGGENIE](https://img.youtube.com/vi/LfCqiToOCvI/0.jpg)](https://www.youtube.com/watch?v=LfCqiToOCvI)
+
 ### 📄 Documentation
 Comprehensive documentation is available to help you get the most out of RAGGENIE. The full documentation for RAGGENIE can be found [here]()
 

diff --git a/documents/.gitignore b/documents/.gitignore
@@ -0,0 +1,20 @@
+# Dependencies
+/node_modules
+
+# Production
+/build
+
+# Generated files
+.docusaurus
+.cache-loader
+
+# Misc
+.DS_Store
+.env.local
+.env.development.local
+.env.test.local
+.env.production.local
+
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
diff --git a/documents/README.md b/documents/README.md
@@ -0,0 +1,41 @@
+# Website
+
+This website is built using [Docusaurus](https://docusaurus.io/), a modern static website generator.
+
+### Installation
+
+```
+$ yarn
+```
+
+### Local Development
+
+```
+$ yarn start
+```
+
+This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.
+
+### Build
+
+```
+$ yarn build
+```
+
+This command generates static content into the `build` directory and can be served using any static contents hosting service.
+
+### Deployment
+
+Using SSH:
+
+```
+$ USE_SSH=true yarn deploy
+```
+
+Not using SSH:
+
+```
+$ GIT_USER=<Your GitHub username> yarn deploy
+```
+
+If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.
diff --git a/documents/babel.config.js b/documents/babel.config.js
@@ -0,0 +1,3 @@
+module.exports = {
+  presets: [require.resolve('@docusaurus/core/lib/babel/preset')],
+};
diff --git a/documents/docs/Configuring agents.md b/documents/docs/Configuring agents.md
@@ -0,0 +1,5 @@
+---
+sidebar_position: 7
+---
+
+# Configuring agents
diff --git a/documents/docs/Connectors/Airtable.md b/documents/docs/Connectors/Airtable.md
@@ -0,0 +1,17 @@
+---
+sidebar_position: 2
+---
+
+# Airtable Plugin
+
+### Plugin name
+The plugin name is used to differentiate between connected plugins. It is used in LLM calls during intent extraction.
+
+### Plugin Description
+A brief description of the data in the plugin. This is used during LLM calls and may affect the quality of the LLM response, so make sure it is descriptive enough for good LLM output while being short enough to keep LLM cost low.
+
+### Airtable token
+The Airtable Token is an API key used to authenticate and access data from Airtable. Airtable integration allows the plugin to retrieve structured datasets and tables that will be used during query generation.
+
+### Airtable workspace id
+The Airtable Workspace ID specifies which workspace within your Airtable account the plugin will connect to. A workspace can contain multiple bases, and identifying the correct workspace is important for retrieving the right data.
diff --git a/documents/docs/Connectors/Bigquery.md b/documents/docs/Connectors/Bigquery.md
@@ -0,0 +1,17 @@
+---
+sidebar_position: 3
+---
+
+# BigQuery Plugin
+
+### Plugin name
+The plugin name is used to differentiate between connected plugins. It is used in LLM calls during intent extraction.
+
+### Plugin Description
+A brief description of the data in the plugin. This is used during LLM calls and may affect the quality of the LLM response, so make sure it is descriptive enough for good LLM output while being short enough to keep LLM cost low.
+
+### Service account JSON
+The Service Account JSON contains authentication credentials that allow your RAG application to access Google BigQuery securely. This file is essential for granting the necessary permissions to query data stored in BigQuery.
+
+### Project id
+The Project ID refers to the specific Google Cloud project where your BigQuery datasets reside. Each BigQuery query is associated with a project, and the project ID is used to identify which datasets the plugin should access.
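Illustration only, not part of this commit: a minimal sketch of a pre-flight check on the Service Account JSON described in Bigquery.md above. It assumes the standard Google Cloud service account key layout (`type`, `project_id`, `private_key`, `client_email`). The function name `check_service_account` is hypothetical and does not come from the RAGGENIE codebase.

```python
import json

# Keys found in a standard Google Cloud service account key file.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def check_service_account(raw_json: str) -> list[str]:
    """Return a list of problems; an empty list means the JSON looks usable."""
    try:
        info = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(info, dict):
        return ["top-level value must be a JSON object"]

    # Report every missing key so the user can fix them in one pass.
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in info]
    if "type" in info and info["type"] != "service_account":
        problems.append("type is not 'service_account'")
    return problems
```

A check like this only catches malformed input early. Whether the credentials actually grant access to the project is still decided by BigQuery when the connection is tested.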
diff --git a/documents/docs/Connectors/Connectors.md b/documents/docs/Connectors/Connectors.md
@@ -0,0 +1,18 @@
+---
+sidebar_position: 4
+---
+
+# Connectors/plugins
+Different components in your LLM app can be inserted using plugins.
+
+## Data Sources
+Currently these are the datasource plugins that are available in raggenie.
+
+### Structured Datasources
+* [PostgreSQL](./Postgressql)
+* [Airtable](./Airtable)
+* [Bigquery](./Bigquery)
+
+### Unstructured Datasources
+* [PDFs](./PDFs)
+* [Websites](./Websites)
diff --git a/documents/docs/Connectors/PDFs.md b/documents/docs/Connectors/PDFs.md
@@ -0,0 +1,14 @@
+---
+sidebar_position: 4
+---
+
+# PDFs Plugin
+
+### Plugin name
+The plugin name is used to differentiate between connected plugins. It is used in LLM calls during intent extraction.
+
+### Plugin Description
+A brief description of the data in the plugin. This is used during LLM calls and may affect the quality of the LLM response, so make sure it is descriptive enough for good LLM output while being short enough to keep LLM cost low.
+
+### File upload
+The File Upload section allows users to upload PDF files into the plugin. These files are then used as a data source for LLM interactions, enabling the system to retrieve and extract relevant information when necessary.
diff --git a/documents/docs/Connectors/Postgressql.md b/documents/docs/Connectors/Postgressql.md
@@ -0,0 +1,35 @@
+---
+sidebar_position: 1
+---
+
+# PostgreSQL Plugin
+
+You can connect to a PostgreSQL instance using the PostgreSQL plugin.
+
+### Plugin name
+The plugin name is used to differentiate between connected plugins. It is used in LLM calls during intent extraction.
+
+### Plugin Description
+A brief description of the data in the plugin. This is used during LLM calls and may affect the quality of the LLM response, so make sure it is descriptive enough for good LLM output while being short enough to keep LLM cost low.
+
+### Database sslmode
+SSL Mode determines whether SSL encryption is used when connecting to the PostgreSQL database. When it is enabled, data transmitted between raggenie and the database is encrypted in transit.
+
+* sslmode=disable: No SSL is used when connecting to the database. This option can be used if the database server does not require encrypted connections or if encryption is not a priority. However, this may expose sensitive data to potential interception.
+
+* sslmode=require: Enforces the use of SSL for database connections. This is recommended for environments where sensitive data is transmitted or where security is a concern.
+
+### Database name
+The Database Name is the name of the PostgreSQL database that raggenie will connect to. Each PostgreSQL server can host multiple databases, and specifying the correct database name is crucial to ensure that raggenie accesses the intended data.
+
+### Database host
+The Database Host is the hostname or IP address where the PostgreSQL server is running. This could be a local server, a remote machine, or a cloud-hosted instance. Ensure that the specified host is reachable from your application's environment.
+
+### Database port
+The Database Port is the TCP/IP port on which the PostgreSQL server is listening. The default port for PostgreSQL is `5432`, but this can be configured to a different port based on your setup.
+
+### Password
+The password of the user that raggenie uses to access the PostgreSQL database.
+
+### User name
+The Username is the identity that the application uses to connect to the PostgreSQL database. Each user in PostgreSQL can have different permissions, and it is important to use a user with the necessary roles for the application's functionality.
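Illustration only, not part of this commit: a minimal sketch of how the fields described in Postgressql.md above could be combined into a standard `postgresql://` connection URI. The helper name `build_postgres_url` is hypothetical, and restricting `sslmode` to the two documented values is an assumption made for this example. PostgreSQL itself accepts further modes.

```python
from urllib.parse import quote, urlencode

# Assumption: only the two modes described in the plugin docs are accepted here.
ALLOWED_SSLMODES = {"disable", "require"}


def build_postgres_url(host, port, database, user, password, sslmode="require"):
    """Assemble a postgresql:// URI from the plugin's connection fields."""
    if sslmode not in ALLOWED_SSLMODES:
        raise ValueError(f"unsupported sslmode: {sslmode!r}")
    port = int(port)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")

    # Percent-encode credentials so characters like '@' or '/' stay unambiguous.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    query = urlencode({"sslmode": sslmode})
    return f"postgresql://{credentials}@{host}:{port}/{quote(database, safe='')}?{query}"


if __name__ == "__main__":
    print(build_postgres_url("localhost", 5432, "sales", "bot", "secret"))
```

Defaulting to `sslmode="require"` in the sketch follows the recommendation above for environments where sensitive data is transmitted.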
diff --git a/documents/docs/Connectors/Websites.md b/documents/docs/Connectors/Websites.md
@@ -0,0 +1,14 @@
+---
+sidebar_position: 5
+---
+
+# Websites Plugin
+
+### Plugin name
+The plugin name is used to differentiate between connected plugins. It is used in LLM calls during intent extraction.
+
+### Plugin Description
+A brief description of the data in the plugin. This is used during LLM calls and may affect the quality of the LLM response, so make sure it is descriptive enough for good LLM output while being short enough to keep LLM cost low.
+
+### Website URL
+The Website URL is the address of the website from which the plugin will fetch documents and data. The plugin will query this URL to retrieve the required content for use during LLM interactions.
diff --git a/documents/docs/Examples.md b/documents/docs/Examples.md
@@ -0,0 +1,5 @@
+---
+sidebar_position: 4
+---
+
+# Examples
diff --git a/documents/docs/How to configure raggenie/Configuration.md b/documents/docs/How to configure raggenie/Configuration.md
@@ -0,0 +1,25 @@
+---
+sidebar_position: 2
+---
+
+# Configuration
+
+## Configuration details
+Provide a bot name, a short description of the bot, and a long description of the bot's use case.
+Note: The long description is used when making LLM calls and therefore affects the performance of the chatbot. It is recommended to give a detailed description that helps the LLM understand its use cases.
+
+## Inference endpoint
+To add an LLM endpoint choose your LLM inference provider and specify a unique name to reference the particular model.
+![LLM inference plugin image](../../static/img/inferance_end_point.png?raw=true)
+
+Specify the model name, inference provider endpoint, and the API key.
+
+## Capabilities
+Capabilities can be defined to make your chatbot perform custom actions such as filling a form or booking a meeting. Currently, actions can be defined to interact with your datasources or with webhooks.
+### Add Capability Name and Description
+The capability name and description are used by the intent extraction module to determine which capability to execute, so it is important to give a detailed description of the capability.
+![Capability initialisation image](../../static/img/Capbilities.png?raw=true)
+### Add Capability Parameters
+You can specify the parameters necessary to execute an action. Raggenie uses LLM calls to check whether all the specified parameters can be retrieved from the user input. If the LLM cannot detect all the necessary parameters, raggenie asks the user to specify the missing ones.
+![Capability parameters image](../../static/img/Create_parameter.png?raw=true)
+These parameters can be used to trigger an action.
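Illustration only, not part of this commit: a minimal sketch of the missing-parameter step described in Configuration.md above. It assumes the LLM extraction result is already available as a plain dictionary. The function names and the wording of the follow-up question are hypothetical and are not taken from the RAGGENIE codebase.

```python
def missing_parameters(required, extracted):
    """Return required parameter names with no usable value, in declared order."""
    return [name for name in required if extracted.get(name) in (None, "")]


def follow_up_question(missing):
    """Build a prompt asking the user for the missing values, or None if complete."""
    if not missing:
        return None
    return "Please provide: " + ", ".join(missing)


if __name__ == "__main__":
    # Hypothetical capability: booking a meeting needs three parameters.
    required = ["name", "date", "email"]
    extracted = {"name": "Asha", "date": ""}  # stand-in for the LLM's output
    print(follow_up_question(missing_parameters(required, extracted)))
```

Once `missing_parameters` returns an empty list, the collected values can be passed on to whatever triggers the action.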
diff --git a/documents/docs/How to configure raggenie/Deploy.md b/documents/docs/How to configure raggenie/Deploy.md
@@ -0,0 +1,9 @@
+---
+sidebar_position: 4
+---
+
+# Deploy
+
+Use `Restart Chatbot` to apply all the changes made to the chatbot. This restarts the backend and its connections with the updated configuration.
+
+You can also get the live preview URL here to share with your end users.
diff --git a/documents/docs/How to configure raggenie/Plugins.md b/documents/docs/How to configure raggenie/Plugins.md
@@ -0,0 +1,19 @@
+---
+sidebar_position: 1
+---
+
+# Plugins
+
+## Configuration
+Plugin configuration specifies the metadata of a datasource, such as its name, description, and login details.
+You need to provide the following information:
+* Plugin Name: Used to differentiate between connected plugins.
+* Database Description: A brief description of the database's use case. The description is used during LLM calls, so a more detailed description may improve the relevance of LLM output. It should be between 100 and 200 characters, detailed enough to be useful while keeping the token count low.
+* Database login details: These are specific to each plugin. Refer to [Plugins](../Connectors) for more details.
+
+After entering all the details, use the `connection test` button to perform a health check. If the health check passes, use `save & continue` to save the plugin.
+
+## Database schema
+Raggenie automatically fetches your schema from the database when you save the configuration. Edit and add descriptions for the tables and their columns. These descriptions are used during LLM calls and are necessary for usable LLM responses. After adding descriptions, use `save & continue`.
+
+## Documentation
+You can add documentation for the plugin. Use it to record important details, conditions, and criteria that explain how the plugin functions and how to use it effectively. This data is split into chunks and retrieved along with the schema during RAG execution, which can help improve LLM responses. Then use `save & continue` to fully save the plugin.
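Illustration only, not part of this commit: a minimal sketch of the 100 to 200 character guideline that Plugins.md gives for the database description. The function name `check_description` and the message wording are hypothetical. Whether RAGGENIE enforces the range anywhere is not stated in the docs, so treat this as a possible client-side check.

```python
MIN_CHARS, MAX_CHARS = 100, 200  # range recommended in Plugins.md


def check_description(text):
    """Return None if the description fits the guideline, else a short hint."""
    length = len(text.strip())
    if length < MIN_CHARS:
        return f"too short: {length} characters, aim for at least {MIN_CHARS}"
    if length > MAX_CHARS:
        return f"too long: {length} characters, aim for at most {MAX_CHARS}"
    return None


if __name__ == "__main__":
    print(check_description("Orders database."))
```

Counting characters is only a proxy for token count, but it matches how the guideline itself is phrased.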