A simple, multi-user, multi-conversation, web-based chatbot.
Chatbot incorporates OpenID Connect for user identification. It relies on an external OAuth client, `oidc-authservice`, to handle authentication and set a trusted `userid` header for the downstream services. Alternatively, `oauth2-proxy`, which is more actively maintained, can be used in place of `oidc-authservice`.
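
As a rough illustration of how a downstream service might consume that trusted header, here is a minimal sketch. The header name `kubeflow-userid` and the helper `get_user_id` are assumptions made for the example; the actual name depends on how the auth proxy is configured.

```python
from typing import Mapping, Optional

# Assumption: the header name is deployment-specific. "kubeflow-userid" is
# only a placeholder for whatever the auth proxy is configured to set.
USERID_HEADER = "kubeflow-userid"


def get_user_id(
    headers: Mapping[str, str], header_name: str = USERID_HEADER
) -> Optional[str]:
    """Return the user ID set by the auth proxy, or None if it is absent."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {key.lower(): value for key, value in headers.items()}
    value = normalised.get(header_name.lower(), "").strip()
    return value or None


if __name__ == "__main__":
    print(get_user_id({"Kubeflow-UserId": "alice@example.com"}))
```

The header can only be trusted if the service is reachable exclusively through the authenticating proxy; a client that can reach the service directly could set the header itself.
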
Chatbot supports multiple conversations. Each conversation is identified by a unique `conversationId`. A conversation consists of a sequence of messages as well as metadata such as `title` and `updatedAt`. Message persistence is handled by LangChain's `RedisChatMessageHistory` module, which uses Redis to store chat history. Metadata persistence is handled separately by `redis-om`, an object mapping library for Redis from Redis Labs. Separating message content from metadata storage keeps the chatbot's persistence architecture modular and flexible.
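
The split between message storage and metadata storage can be pictured with the in-memory sketch below. It is not the project's code: plain dictionaries stand in for `RedisChatMessageHistory` and the `redis-om` models, and the `ConversationStore` class and the `owner` field are hypothetical names introduced for the example.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class ConversationMeta:
    """Metadata only; message bodies are stored elsewhere."""
    owner: str  # hypothetical field: the user who owns the conversation
    title: str
    conversation_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    updated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class ConversationStore:
    """In-memory stand-in that keeps metadata and messages in separate stores."""

    def __init__(self) -> None:
        self._meta: Dict[str, ConversationMeta] = {}  # stand-in for redis-om
        self._messages: Dict[str, List[dict]] = {}    # stand-in for chat history

    def create(self, owner: str, title: str) -> ConversationMeta:
        meta = ConversationMeta(owner=owner, title=title)
        self._meta[meta.conversation_id] = meta
        self._messages[meta.conversation_id] = []
        return meta

    def add_message(self, conversation_id: str, role: str, content: str) -> None:
        self._messages[conversation_id].append({"role": role, "content": content})
        # Touch the metadata so conversation lists can sort by recency.
        self._meta[conversation_id].updated_at = datetime.now(timezone.utc)

    def list_for(self, owner: str) -> List[ConversationMeta]:
        # Listing conversations reads metadata only, never message bodies.
        owned = [m for m in self._meta.values() if m.owner == owner]
        return sorted(owned, key=lambda m: m.updated_at, reverse=True)

    def messages(self, conversation_id: str) -> List[dict]:
        return list(self._messages[conversation_id])
```

With this layout, rendering a sidebar of conversation titles does not require loading any chat history, and either store could be swapped out without touching the other.
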
Chatbot supports streaming LLM outputs to the user in real time. Streaming messages are delivered via WebSockets, which enable bidirectional, full-duplex communication between the server and client.
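
To make the streaming flow concrete, the sketch below forwards tokens to a client one frame at a time. It is independent of any WebSocket library: `send` is any coroutine that writes a text frame, and the JSON frame layout (`type`, `conversation`, `content`) is a hypothetical format chosen for the example, not the project's wire protocol.

```python
import asyncio
import json
from typing import AsyncIterator, Awaitable, Callable


async def stream_reply(
    conversation_id: str,
    tokens: AsyncIterator[str],
    send: Callable[[str], Awaitable[None]],
) -> str:
    """Send each token as its own frame, then an end marker; return the full text."""
    parts = []
    async for token in tokens:
        parts.append(token)
        frame = {"type": "token", "conversation": conversation_id, "content": token}
        await send(json.dumps(frame))
    # The end marker tells the client that the message is complete.
    await send(json.dumps({"type": "end", "conversation": conversation_id}))
    return "".join(parts)


async def _demo() -> None:
    async def fake_tokens() -> AsyncIterator[str]:
        for piece in ["Hel", "lo", "!"]:
            yield piece

    async def print_frame(frame: str) -> None:
        print(frame)

    await stream_reply("demo-conversation", fake_tokens(), print_frame)


if __name__ == "__main__":
    asyncio.run(_demo())
```

In a real server, `send` would typically be the WebSocket connection's method for sending text, and the returned full text could be saved to the chat history once the stream ends.
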
On the LLM side, Chatbot uses Text Generation Inference (TGI), an open source toolkit from Hugging Face, to host large language models for text generation. TGI provides out-of-the-box support for continuous batching, streaming inference, and other useful features for deploying production-ready LLMs. Using TGI eliminates the need to build complex serving infrastructure from scratch. Continuous batching lets the server add incoming requests to a batch that is already running, which raises throughput. Streaming inference lets the chatbot return tokens as they are generated rather than waiting for the full output.
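
The sketch below shows roughly what consuming a TGI token stream involves, assuming TGI's streaming endpoint emits server-sent events whose `data:` payload is a JSON object with a `token` field containing `text` and `special`. The helper names are hypothetical, and the payload shape should be checked against the TGI version actually deployed.

```python
import json
from typing import Iterable, Iterator, Optional


def build_payload(prompt: str, max_new_tokens: int = 256) -> dict:
    """Request body for a generation call (assumed shape: inputs + parameters)."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}


def parse_event(line: str) -> Optional[str]:
    """Return the token text carried by one SSE line, or None if there is none."""
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank keep-alive lines and other SSE fields are skipped
    event = json.loads(line[len("data:"):])
    token = event.get("token") or {}
    if token.get("special"):
        return None  # special tokens such as end-of-sequence are not shown
    return token.get("text")


def iter_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield displayable token texts from an iterable of decoded SSE lines."""
    for line in lines:
        text = parse_event(line)
        if text:
            yield text
```

Here `lines` could be the decoded lines of a streaming HTTP response from the inference server; each yielded token can be forwarded to the user immediately.
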

| Key | Default Value | Description |
|---|---|---|
| `LOG_LEVEL` | `INFO` | Log level |
| `REDIS_OM_URL` | `redis://localhost:6379` | Redis URL used to persist messages and metadata |
| `INFERENCE_SERVER_URL` | `http://localhost:8080` | Model service URL |
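
Assuming these keys are read from environment variables (the table does not say so explicitly), a minimal loader with the same defaults might look like the following. The `Settings` class and `load_settings` function are hypothetical names for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    log_level: str
    redis_om_url: str
    inference_server_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from `env`, falling back to the documented defaults."""
    return Settings(
        log_level=env.get("LOG_LEVEL", "INFO"),
        redis_om_url=env.get("REDIS_OM_URL", "redis://localhost:6379"),
        inference_server_url=env.get(
            "INFERENCE_SERVER_URL", "http://localhost:8080"
        ),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary as `env` makes the loader easy to exercise in tests without modifying the process environment.
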