feat: update before production #6

Merged: 2 commits, Jul 18, 2024
52 changes: 39 additions & 13 deletions README.md
@@ -34,31 +34,55 @@ Based on the [official OpenAI Python client](https://github.com/openai/op

This formalism makes it easy to integrate the Albert API with third-party libraries such as [Langchain](https://www.langchain.com/) or [LlamaIndex](https://www.llamaindex.ai/).

### Multi models
### Chat with a language model (chat memory)

The Albert API natively stores conversation messages, without adding extra arguments to the `/v1/chat/completions` endpoint beyond the OpenAI documentation. On each request, the conversation history is sent to the model to give it context.

> 📖 [Demo notebook](./tutorials/chat_completions.ipynb)
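
As a rough sketch, a follow-up request that relies on this memory only needs the new message plus a `user` id; the stored history stays server-side. The base URL, token, model id, and field values below are placeholders for your own deployment, not part of the official spec:

```python
import json

# Placeholders: adjust to your own deployment (assumptions, not spec).
BASE_URL = "http://localhost:8080/v1"
API_KEY = "YOUR_TOKEN"

# Follow-up turn of an existing conversation: only the new message is sent;
# the API prepends the stored history before calling the model.
payload = {
    "model": "AgentPublic/llama3-instruct-8b",  # any id returned by /v1/models
    "messages": [{"role": "user", "content": "And when was he born?"}],
    "user": "demo-user",  # a user id is what enables server-side history
    "stream": False,
}

# The actual call (requires a running instance and the `requests` package):
# import requests
# r = requests.post(f"{BASE_URL}/chat/completions", json=payload,
#                   headers={"Authorization": f"Bearer {API_KEY}"})
print(json.dumps(payload, indent=2))
```

Because only the newest message travels with the request, the payload stays the same size however long the conversation gets.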

### Access multiple language models (multi models)

With a configuration file (*[config.example.yml](./config.example.yml)*) you can connect as many model APIs as you want. The Albert API pools access to all of these models behind a single API. You can list the available models by calling the `/v1/models` endpoint.

> 📖 [Demo notebook](./tutorials/models.ipynb)
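
Since the API follows the OpenAI format, the official client can list the pooled models directly. A minimal sketch, assuming a local deployment URL and token:

```python
from urllib.parse import urljoin

BASE_URL = "http://localhost:8080/v1/"  # placeholder deployment URL
url = urljoin(BASE_URL, "models")

# With the official OpenAI client (network call, shown for reference only):
# from openai import OpenAI
# client = OpenAI(base_url=BASE_URL, api_key="YOUR_TOKEN")
# for model in client.models.list():
#     print(model.id)
print(url)
```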

### Chat history
### Advanced features (tools)

The Albert API natively stores conversation messages, without adding extra arguments to the `/v1/chat/completions` endpoint beyond the OpenAI documentation. On each request, the conversation history is sent to the model to give it context.
Tools are a feature defined by OpenAI that the Albert API extends to configure specific tasks such as RAG or summarization. You can call the `/tools` endpoint to see the list of available tools.

> 📖 [Demo notebook](./tutorials/chat_completions.ipynb)
![](./assets/chatcompletion.png)
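
A hedged sketch of the discovery call: the `/tools` route is an Albert extension (mounted under the `/v1` prefix in this PR's `app/main.py`); the URL and token are placeholders:

```python
BASE_URL = "http://localhost:8080"  # placeholder deployment URL
API_KEY = "YOUR_TOKEN"              # placeholder token

# Build the request; the routers in this PR mount /tools under /v1.
tools_request = {
    "method": "GET",
    "url": f"{BASE_URL}/v1/tools",
    "headers": {"Authorization": f"Bearer {API_KEY}"},
}

# The actual call, against a running instance:
# import requests
# print(requests.request(**tools_request).json())
print(tools_request["url"])
```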

### Tools (multi agents, RAG, summarization...)
#### Query documents (RAG)

Tools are a feature defined by OpenAI that the Albert API extends to configure specific tasks such as RAG or summarization. You can call the `/tools` endpoint to see the list of available tools.
> 📖 [Demo notebook](./tutorials/retrival_augmented_generation.ipynb)

> 📖 [Demo notebook: RAG](./tutorials/retrival_augmented_generation.ipynb)
### Summarize a document (summarize)

![](./assets/chatcompletion.png)
> 📖 [Demo notebook](./tutorials/summarize.ipynb)

### Token-based access
## Deploy the Albert API

The Albert API lets you protect access with one or more authentication tokens; see the [Auth](#auth) section for more information.
### Quickstart

## Configuration
1. Install [libmagic](https://man7.org/linux/man-pages/man3/libmagic.3.html)

2. Install the Python packages

```bash
cd app
pip install .
```

3. Create a *config.yml* file at the root of the repository, based on the example file *[config.example.yml](./config.example.yml)*

To configure access to models and databases, see the [Configuration](#configuration) section.

To launch the API:
```bash
uvicorn app.main:app --reload --port 8080 --log-level debug
```
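
Once uvicorn is up, a quick smoke test against the `/health` route (defined in this PR's `app/main.py`) confirms the deployment; the token is a placeholder for whatever you configured:

```python
BASE_URL = "http://localhost:8080"  # matches the uvicorn command above

health_check = {
    "url": f"{BASE_URL}/health",
    "headers": {"Authorization": "Bearer YOUR_TOKEN"},  # placeholder token
}

# Against a running instance (requires the `requests` package):
# import requests
# assert requests.get(**health_check).status_code == 200
print(health_check["url"])
```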

### Configuration

All configuration of the Albert API is done in a configuration file that must follow the specifications below (see *[config.example.yml](./config.example.yml)* for an example):
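
The file is YAML, and its location can be overridden with the `CONFIG_FILE` variable used later in this README. A minimal pre-launch sanity check, sketched under the assumption that a `models` key exists as in config.example.yml:

```python
import os

# CONFIG_FILE overrides the default path, as in the uvicorn command below.
config_path = os.environ.get("CONFIG_FILE") or "config.yml"

# Against a real checkout (pyyaml is already a project dependency):
# import yaml
# with open(config_path) as f:
#     config = yaml.safe_load(f)
# assert "models" in config, "declare at least one model API"
print(config_path)
```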

@@ -102,13 +126,13 @@ CONFIG_FILE=<path_to_the_file> uvicorn main:app --reload --port 8080 --log-level

The configuration specifies the API access token, the model APIs that the Albert API can reach, and the databases required for it to operate.

### Auth
#### Auth

Supported IAM providers (more will be available soon):

* [Grist](https://www.getgrist.com/)

### Databases
#### Databases

Three databases must be configured in the configuration file (*[config.example.yml](./config.example.yml)*):
* vectors: for the vector store
@@ -125,6 +149,8 @@ Supported database types (more will be available soon):

## Tests

You can verify that your API is deployed correctly by running the unit tests:

```bash
cd app/tests
CONFIG_FILE="../../config.yml" pytest test_models.py
```
7 changes: 0 additions & 7 deletions app/endpoints/__init__.py
@@ -1,7 +0,0 @@
from .chat import router as ChatRouter
from .collections import router as CollectionsRouter
from .completions import router as CompletionsRouter
from .embeddings import router as EmbeddingsRouter
from .files import router as FilesRouter
from .models import router as ModelsRouter
from .tools import router as ToolsRouter
36 changes: 17 additions & 19 deletions app/endpoints/chat.py
@@ -1,21 +1,19 @@
import uuid
import sys

from typing import Optional, Union

from fastapi import APIRouter, Security, HTTPException
from fastapi.responses import StreamingResponse

sys.path.append("..")
from schemas.chat import (
from app.schemas.chat import (
ChatHistory,
ChatHistoryResponse,
ChatCompletionRequest,
ChatCompletionResponse,
)
from utils.security import check_api_key, secure_data
from utils.lifespan import clients
from tools import *
from tools import __all__ as tools_list
from app.utils.security import check_api_key, secure_data
from app.utils.lifespan import clients
from app.tools import *
from app.tools import __all__ as tools_list


router = APIRouter()
@@ -37,6 +35,17 @@ async def chat_completions(
except KeyError:
raise HTTPException(status_code=404, detail="Model not found.")

if request["user"]:
# retrieve chat history
if chat_id:
chat_history = clients["chathistory"].get_chat_history(
user_id=request["user"], chat_id=chat_id
)
if "messages" in chat_history.keys(): # to avoid empty chat history
request["messages"] = chat_history["messages"] + request["messages"]
else:
chat_id = str(uuid.uuid4())

# tools
user_message = request["messages"][-1] # keep user message without tools for chat history
tools = request.get("tools")
@@ -57,17 +66,6 @@
request["messages"] = [{"role": "user", "content": prompt}]
request.pop("tools")

if request["user"]:
# retrieve chat history
if chat_id:
chat_history = clients["chathistory"].get_chat_history(
user_id=request["user"], chat_id=chat_id
)
if "messages" in chat_history.keys(): # to avoid empty chat history
request["messages"] = chat_history["messages"] + request["messages"]
else:
chat_id = str(uuid.uuid4())

# non stream case
if not request["stream"]:
response = client.chat.completions.create(**request)
8 changes: 3 additions & 5 deletions app/endpoints/collections.py
@@ -1,12 +1,10 @@
import sys
import re

from fastapi import APIRouter, Security

sys.path.append("..")
from schemas.collections import CollectionResponse
from utils.security import check_api_key
from utils.lifespan import clients
from app.schemas.collections import CollectionResponse
from app.utils.security import check_api_key
from app.utils.lifespan import clients

router = APIRouter()

9 changes: 3 additions & 6 deletions app/endpoints/completions.py
@@ -1,11 +1,8 @@
import sys

from fastapi import APIRouter, Security

sys.path.append("..")
from schemas.completions import CompletionRequest, CompletionResponse
from utils.lifespan import clients
from utils.security import check_api_key
from app.schemas.completions import CompletionRequest, CompletionResponse
from app.utils.lifespan import clients
from app.utils.security import check_api_key


router = APIRouter()
11 changes: 4 additions & 7 deletions app/endpoints/embeddings.py
@@ -1,11 +1,8 @@
import sys

from fastapi import APIRouter, Security

sys.path.append("..")
from schemas.embeddings import EmbeddingsRequest, EmbeddingResponse
from utils.lifespan import clients
from utils.security import check_api_key
from app.schemas.embeddings import EmbeddingsRequest, EmbeddingResponse
from app.utils.lifespan import clients
from app.utils.security import check_api_key


router = APIRouter()
@@ -24,4 +21,4 @@ async def embeddings(
client = clients["openai"][request["model"]]
response = client.embeddings.create(**request)

return EmbeddingResponse(**response)
return response
12 changes: 5 additions & 7 deletions app/endpoints/files.py
@@ -1,6 +1,5 @@
import base64
import uuid
import sys

from typing import List, Optional, Union

@@ -10,12 +9,11 @@
from langchain_community.vectorstores import Qdrant as VectorStore
from qdrant_client.http import models as rest

sys.path.append("..")
from schemas.files import File, FileResponse, FileUploadResponse
from utils.config import logging
from utils.security import check_api_key, secure_data
from utils.lifespan import clients
from helpers import S3FileLoader
from app.schemas.files import File, FileResponse, FileUploadResponse
from app.utils.config import logging
from app.utils.security import check_api_key, secure_data
from app.utils.lifespan import clients
from app.helpers import S3FileLoader

router = APIRouter()

8 changes: 3 additions & 5 deletions app/endpoints/models.py
@@ -1,13 +1,11 @@
import urllib
from typing import Union, Optional
import sys

from fastapi import APIRouter, Security

sys.path.append("..")
from schemas.models import Model, ModelResponse
from utils.lifespan import clients
from utils.security import check_api_key
from app.schemas.models import Model, ModelResponse
from app.utils.lifespan import clients
from app.utils.security import check_api_key


router = APIRouter()
11 changes: 4 additions & 7 deletions app/endpoints/tools.py
@@ -1,12 +1,9 @@
import sys

from fastapi import APIRouter, Security

sys.path.append("..")
from schemas.tools import ToolResponse
from utils.security import check_api_key
from tools import *
from tools import __all__ as tools_list
from app.schemas.tools import ToolResponse
from app.utils.security import check_api_key
from app.tools import *
from app.tools import __all__ as tools_list

router = APIRouter()

15 changes: 0 additions & 15 deletions app/helpers/_universalparser.py
@@ -1,7 +1,6 @@
from docx import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

# from llmsherpa.readers import LayoutPDFReader
from langchain_community.document_loaders import PDFMinerLoader
from langchain.docstore.document import Document as langchain_doc
import magic
@@ -100,20 +99,6 @@ def _pdf_to_chunks(
list: List of Langchain documents, where each document corresponds to a text chunk.
"""

# Llmsherpa is replaced by PDFMiner for now

# llmsherpa_api_url = (
# "https://readers.llmsherpa.com/api/document/developer/parseDocument?renderFormat=all"
# )
# pdf_reader = LayoutPDFReader(llmsherpa_api_url)
# doc = pdf_reader.read_pdf(file_path)

# def concatene_chunks(doc):
# chunks = []
# for chunk in doc.chunks():
# chunks.append(chunk.to_text())
# return "\n".join(chunks)

loader = PDFMinerLoader(file_path)
doc = loader.load()

39 changes: 19 additions & 20 deletions app/main.py
@@ -1,19 +1,18 @@
from fastapi import FastAPI, Security, Response

from utils.lifespan import lifespan
from utils.security import check_api_key
from endpoints import (
ChatRouter,
CollectionsRouter,
CompletionsRouter,
EmbeddingsRouter,
FilesRouter,
ModelsRouter,
ToolsRouter,
)
from app.utils.lifespan import lifespan
from app.utils.security import check_api_key
from app.endpoints import chat, completions, collections, embeddings, files, models, tools
from app.utils.config import APP_CONTACT_URL, APP_CONTACT_EMAIL, APP_VERSION, APP_DESCRIPTION

# @TODO: add metadata: https://fastapi.tiangolo.com/tutorial/metadata/
app = FastAPI(title="Albert API", version="1.0.0", lifespan=lifespan)
app = FastAPI(
title="Albert API",
version=APP_VERSION,
description=APP_DESCRIPTION,
contact={"url": APP_CONTACT_URL, "email": APP_CONTACT_EMAIL},
    license_info={"name": "MIT License", "identifier": "MIT"},
lifespan=lifespan,
)


@app.get("/health")
@@ -25,10 +24,10 @@ def health(api_key: str = Security(check_api_key)):
return Response(status_code=200)


app.include_router(ModelsRouter, tags=["Models"], prefix="/v1")
app.include_router(ChatRouter, tags=["Chat"], prefix="/v1")
app.include_router(CompletionsRouter, tags=["Completions"], prefix="/v1")
app.include_router(EmbeddingsRouter, tags=["Embeddings"], prefix="/v1")
app.include_router(CollectionsRouter, tags=["Collections"], prefix="/v1")
app.include_router(FilesRouter, tags=["Files"], prefix="/v1")
app.include_router(ToolsRouter, tags=["Tools"], prefix="/v1")
app.include_router(models.router, tags=["Models"], prefix="/v1")
app.include_router(chat.router, tags=["Chat"], prefix="/v1")
app.include_router(completions.router, tags=["Completions"], prefix="/v1")
app.include_router(embeddings.router, tags=["Embeddings"], prefix="/v1")
app.include_router(collections.router, tags=["Collections"], prefix="/v1")
app.include_router(files.router, tags=["Files"], prefix="/v1")
app.include_router(tools.router, tags=["Tools"], prefix="/v1")
2 changes: 1 addition & 1 deletion app/pyproject.toml
@@ -21,10 +21,10 @@ dependencies = [
"docx==0.2.4",
"pyyaml==6.0.1",
"python-docx==1.1.2",
"llmsherpa==0.1.4",
"unstructured==0.14.9",
"python-magic==0.4.27",
"grist-api==0.1.0",
"pdfminer.six==20240706",
]

[tool.setuptools]
5 changes: 1 addition & 4 deletions app/tests/test_chat.py
@@ -1,9 +1,6 @@
import sys

from fastapi.testclient import TestClient

sys.path.append("..")
from main import app
from app.main import app

model = "AgentPublic/llama3-instruct-8b"
prompt = "Hello world !"
5 changes: 1 addition & 4 deletions app/tests/test_models.py
@@ -1,9 +1,6 @@
import sys

from fastapi.testclient import TestClient

sys.path.append("..")
from main import app
from app.main import app


def test_get_models():