# chat-search


Chat with documents, search via natural language.

chat-search uses hybrid language models to add chat capabilities to a website. The RAG pipeline is built with LangChain and Redis, and supports multiple model providers (OpenAI, Ollama, vLLM, Hugging Face).
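The hybrid setup combines a full-text retriever and a vector store retriever, each with a configurable weight (see FULLTEXT_RETRIEVER_WEIGHT and VECTORSTORE_RETRIEVER_WEIGHT below). The actual wiring is done by LangChain; as a rough illustration only, weighted ensembling of two ranked result lists can be sketched like this (the function is hypothetical, not the project's code):

```python
# Hypothetical sketch of weighted Reciprocal Rank Fusion over two ranked
# doc-id lists, in the spirit of LangChain's EnsembleRetriever.

def weighted_rank_fusion(ranked_lists, weights, c=60):
    """Combine ranked lists of doc ids using weighted Reciprocal Rank Fusion."""
    scores = {}
    for docs, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(docs):
            # Each list contributes weight / (c + rank) for every doc it returns.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (c + rank)
    return sorted(scores, key=scores.get, reverse=True)

fulltext_hits = ["doc1", "doc2", "doc3"]
vector_hits = ["doc2", "doc4", "doc1"]
# doc2 ranks first: it appears near the top of both lists.
merged = weighted_rank_fusion([fulltext_hits, vector_hits], weights=[0.5, 0.5])
print(merged)
```

Documents returned by both retrievers accumulate score from each list, so agreement between the two retrievers pushes a document up the merged ranking.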

Demo: Chat about my blog

## Usage

### Setup .env

```shell
cp .env.example .env
```

Populate the .env file with the required environment variables.

| Name | Description | Default |
| --- | --- | --- |
| AUTH_TOKEN | auth token used for ingest | |
| CHAT_PROVIDER | model provider, `openai` or `ollama` | openai |
| DEBUG | enable debug mode, 1 or 0 | 0 |
| DIGEST_PREFIX | prefix for digest keys in Redis | digest |
| DOCUMENT_CONTENT_DESCRIPTION | document content description | Document content |
| EMBEDDING_DIM | embedding dimensions | 1536 |
| EMBEDDING_PROVIDER | embedding provider, `openai`, `ollama`, or `huggingface` | openai |
| ENABLE_FEEDBACK_ENDPOINT | enable feedback endpoint, 1 or 0 | 1 |
| ENABLE_PUBLIC_TRACE_LINK_ENDPOINT | enable public trace link endpoint, 1 or 0 | 1 |
| FULLTEXT_RETRIEVER_SEARCH_K | number of full-text retriever search results | 4 |
| FULLTEXT_RETRIEVER_SELF_QUERY | enable full-text retriever self query, 1 or 0 | 1 |
| FULLTEXT_RETRIEVER_WEIGHT | full-text retriever weight | 0.5 |
| HEADERS_TO_SPLIT_ON | HTML headers to split text on | h1,h2,h3 |
| HF_HUB_EMBEDDING_MODEL | Hugging Face Hub embedding model or Text Embeddings Inference URL | http://localhost:8080 |
| INDEX_NAME | index name | document |
| INDEX_SCHEMA_PATH | index schema path (will use app/schema.yaml) | |
| LANGCHAIN_API_KEY | LangChain API key for LangSmith | |
| LANGCHAIN_ENDPOINT | LangChain endpoint for LangSmith | https://api.smith.langchain.com |
| LANGCHAIN_PROJECT | LangChain project for LangSmith | default |
| LANGCHAIN_TRACING_V2 | enable LangChain tracing v2 | true |
| LLM_TEMPERATURE | temperature for the LLM | 0 |
| MERGE_SYSTEM_PROMPT | merge the system prompt into user input, for models that do not support the system role, 1 or 0 | 0 |
| OLLAMA_CHAT_MODEL | Ollama chat model | llama3 |
| OLLAMA_EMBEDDING_MODEL | Ollama embedding model | nomic-embed-text |
| OLLAMA_URL | Ollama URL | http://localhost:11434 |
| OPENAI_API_BASE | OpenAI-compatible API base URL | https://api.openai.com/v1 |
| OPENAI_API_KEY | OpenAI API key | EMPTY |
| OPENAI_CHAT_MODEL | OpenAI chat model | gpt-4o-mini |
| OPENAI_EMBEDDING_MODEL | OpenAI embedding model | text-embedding-3-small |
| OTEL_SDK_DISABLED | disable OpenTelemetry, true or false | false |
| OTEL_SERVICE_NAME | OpenTelemetry service name, also used as the Pyroscope application name | chat-search |
| PYROSCOPE_BASIC_AUTH_PASSWORD | Pyroscope basic auth password | |
| PYROSCOPE_BASIC_AUTH_USERNAME | Pyroscope basic auth username | |
| PYROSCOPE_ENABLED | enable Pyroscope, 1 or 0 | 1 |
| PYROSCOPE_SERVER_ADDRESS | Pyroscope server address | http://localhost:4040 |
| REDIS_URL | Redis URL | redis://localhost:6379/ |
| REPHRASE_PROMPT | prompt for rephrasing | check config.py |
| RETRIEVAL_QA_CHAT_SYSTEM_PROMPT | prompt for retrieval | check config.py |
| RETRIEVER_SEARCH_K | number of retriever search results | 4 |
| RETRIEVER_SELF_QUERY_EXAMPLES | retriever self query examples, as JSON | check config.py |
| TEXT_SPLIT_CHUNK_OVERLAP | chunk overlap for text splitting | 200 |
| TEXT_SPLIT_CHUNK_SIZE | chunk size for text splitting | 4000 |
| VECTORSTORE_RETRIEVER_SEARCH_KWARGS | search kwargs for the Redis vector store retriever, as JSON | check config.py |
| VECTORSTORE_RETRIEVER_SEARCH_TYPE | search type for the Redis vector store retriever | mmr |
| VECTORSTORE_RETRIEVER_SELF_QUERY | enable vector store retriever self query, 1 or 0 | 1 |
| VECTORSTORE_RETRIEVER_WEIGHT | vector store retriever weight | 0.5 |
| VERBOSE | enable verbose mode, 1 or 0 | 0 |
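All of these settings are plain environment variables. A minimal sketch of how the 1/0 flags and string settings above can be read with their defaults (the helper functions are hypothetical; the project's actual logic lives in config.py):

```python
import os

def env_flag(name, default="0"):
    # 1/0 style flags as used by DEBUG, VERBOSE, PYROSCOPE_ENABLED, ...
    return os.environ.get(name, default) == "1"

def env_str(name, default):
    return os.environ.get(name, default)

chat_provider = env_str("CHAT_PROVIDER", "openai")
debug = env_flag("DEBUG", "0")
chunk_size = int(env_str("TEXT_SPLIT_CHUNK_SIZE", "4000"))
print(chat_provider, debug, chunk_size)
```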

### Start Ollama (Optional)

Follow the Ollama instructions:

```shell
ollama serve
ollama pull llama3
ollama pull nomic-embed-text
```

### Run on host

Install dependencies:

```shell
pip install poetry==1.7.1
poetry shell
poetry install
```

Start dependencies:

Start Redis:

```shell
docker compose -f compose.redis.yaml up
```

Launch LangServe:

```shell
langchain serve
```

Visit http://localhost:8000/
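LangServe exposes chains over a standard REST interface. Assuming the chat chain is mounted at /chat, an invoke request can be sketched as follows (the input schema here is an assumption; check the playground at /chat/playground for the real one):

```python
import json
import urllib.request

# Hypothetical payload: LangServe invoke endpoints wrap the chain's
# input in {"input": ...}; the inner keys depend on the chain.
payload = {
    "input": {
        "chat_history": [],
        "question": "What is this blog about?",
    }
}

req = urllib.request.Request(
    "http://localhost:8000/chat/invoke",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment when the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```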

### Run in Docker

There is a compose.yml file for running the app and all of its dependencies in containers, suitable for local end-to-end testing.

```shell
docker compose up --build
```

Visit http://localhost:8000/

### Run in Kubernetes

There is a Helm chart for deploying the app to Kubernetes.

#### Using Helm

Config Helm values:

```shell
cp values.example.yaml values.yaml
```

Then update values.yaml accordingly.

Add helm repos:

```shell
helm repo add chat-search https://hemslo.github.io/chat-search/
helm repo add redis-stack https://redis-stack.github.io/helm-redis-stack/
helm repo add ollama-helm https://otwld.github.io/ollama-helm/
```

Install/upgrade chat-search:

```shell
helm upgrade -i --wait my-chat-search chat-search/chat-search -f values.yaml
```

#### Using Skaffold for local development

```shell
skaffold run --port-forward
```

### Ingest data

```shell
crawl --sitemap-url $SITEMAP_URL --auth-token $AUTH_TOKEN
```

Check crawl.yml for web crawling. For an example of auto ingest after a GitHub Pages deploy, see jekyll.yml.
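The DIGEST_PREFIX setting suggests that ingested documents are tracked by content digest in Redis, so unchanged pages can be skipped on re-crawl. A minimal sketch of that idea, assuming SHA-256 (the key layout and helper names are hypothetical):

```python
import hashlib

def digest_key(url, prefix="digest"):
    # Mirrors the DIGEST_PREFIX setting: one Redis key per crawled URL.
    return f"{prefix}:{url}"

def content_digest(text):
    return hashlib.sha256(text.encode()).hexdigest()

def needs_ingest(store, url, text):
    """Skip documents whose content digest is unchanged since the last crawl."""
    key, new = digest_key(url), content_digest(text)
    if store.get(key) == new:
        return False
    store[key] = new  # a real implementation would SET this key in Redis
    return True

store = {}
print(needs_ingest(store, "https://example.com/post", "hello"))  # → True
```

On a second crawl with identical content, the stored digest matches and the document is skipped, which keeps re-ingestion cheap.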

## Architecture

### Ingest

```mermaid
flowchart LR
  A(Crawl) --> |doc| B(/ingest)
  B --> |metadata| C(Redis)
  B --> |doc| D(Text Splitter)
  D --> |docs| E(Embedding Model)
  E --> |docs with embeddings| F(Redis)
```
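The Text Splitter stage corresponds to the TEXT_SPLIT_CHUNK_SIZE and TEXT_SPLIT_CHUNK_OVERLAP settings (with HEADERS_TO_SPLIT_ON guiding HTML splits). The project itself uses LangChain splitters; as a rough illustration, sliding-window character chunking with overlap looks like this:

```python
def split_text(text, chunk_size=4000, chunk_overlap=200):
    """Naive character-window splitter: each chunk overlaps the previous one."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, max(len(text), 1), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

chunks = split_text("x" * 9000, chunk_size=4000, chunk_overlap=200)
print([len(c) for c in chunks])  # → [4000, 4000, 1400]
```

The overlap means the last 200 characters of one chunk reopen the next, so sentences cut at a chunk boundary still appear intact in at least one chunk.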

### Query

```mermaid
flowchart LR
  A((Request)) --> |messages| B(/chat)
  B --> |messages| C(LLM)
  C --> |question| D(Embedding Model)
  D --> |embeddings| E(Redis)
  E --> |relevant docs| F(LLM)
  B --> |messages| F
  F --> |answer| G((Response))
```
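On the query side, the MERGE_SYSTEM_PROMPT flag exists because some models do not accept a system role; when it is enabled, the system prompt is merged into the user input instead. A rough sketch of that transformation (the message shape and helper are assumptions, not the project's actual code):

```python
def merge_system_prompt(messages):
    """Fold a leading system message into the first user message."""
    if not messages or messages[0]["role"] != "system":
        return messages
    system, rest = messages[0], messages[1:]
    merged, folded = [], False
    for msg in rest:
        if msg["role"] == "user" and not folded:
            # Prepend the system instructions to the first user turn.
            merged.append({"role": "user",
                           "content": f"{system['content']}\n\n{msg['content']}"})
            folded = True
        else:
            merged.append(msg)
    return merged

out = merge_system_prompt([
    {"role": "system", "content": "Answer from the docs."},
    {"role": "user", "content": "What is chat-search?"},
])
print(out[0]["content"])
```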

## Deployment

Check cicd.yml for the Google Cloud Run deployment workflow, based on deploy-to-cloud-run.