
Added Ollama and OpenAI FastAPI services #482

Merged
merged 18 commits into HKUDS:main on Dec 20, 2024

Conversation

ParisNeo
Contributor

Add FastAPI Services for LightRAG Integration

Overview

Added two FastAPI services that provide REST API endpoints for utilizing LightRAG in distributed applications:

  1. Ollama-based FastAPI service
  2. OpenAI-based FastAPI service

These services enable easy integration of LightRAG capabilities into existing applications through HTTP endpoints.

Key Features

Common Features

  • Multiple search modes (naive, local, global, hybrid)
  • Streaming and non-streaming query responses
  • Document management (insert, upload, batch processing)
  • Health monitoring endpoints
  • Automatic API documentation (Swagger/ReDoc)
  • Configurable working directories
  • Asynchronous operation support

Ollama Service Specific

  • Configurable Ollama host
  • Support for various Ollama models
  • Default integration with mistral-nemo and bge-m3
  • Adjustable async operation limits
  • Custom embedding dimensions

OpenAI Service Specific

  • OpenAI API integration
  • Support for latest GPT models
  • text-embedding-3-large integration
  • Automatic embedding dimension detection
  • nest-asyncio implementation for better async handling

Configuration Options

Ollama Service

--host: Server host (default: 0.0.0.0)
--port: Server port (default: 9621)
--model: LLM model name (default: mistral-nemo:latest)
--embedding-model: Embedding model (default: bge-m3:latest)
--ollama-host: Ollama host URL
--working-dir: RAG storage location
--max-async: Maximum concurrent operations
--max-tokens: Token limit
--embedding-dim: Embedding dimensions
--max-embed-tokens: Embedding token limit
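
For example, the Ollama service could be launched with these options as follows (a sketch only: the script name is a placeholder, and the actual entry point depends on where the service lives in the repository; http://localhost:11434 is Ollama's standard default host):

python lightrag_ollama_server.py \
    --host 0.0.0.0 \
    --port 9621 \
    --model mistral-nemo:latest \
    --embedding-model bge-m3:latest \
    --ollama-host http://localhost:11434 \
    --working-dir ./rag_storage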

OpenAI Service

--host: Server host (default: 0.0.0.0)
--port: Server port (default: 9621)
--model: OpenAI model (default: gpt-4)
--embedding-model: Embedding model (default: text-embedding-3-large)
--working-dir: RAG storage location
--max-tokens: Token limit
--max-embed-tokens: Embedding token limit
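
The OpenAI service can be launched similarly (again a sketch with a placeholder script name; it is assumed the service picks up the key from the standard OPENAI_API_KEY environment variable, as the OpenAI client library does):

export OPENAI_API_KEY="sk-..."
python lightrag_openai_server.py \
    --port 9621 \
    --model gpt-4 \
    --embedding-model text-embedding-3-large \
    --working-dir ./rag_storage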

API Endpoints

Both services provide:

  • /query: Document querying
  • /query/stream: Streaming responses
  • /documents/text: Text insertion
  • /documents/file: File upload
  • /documents/batch: Batch file processing
  • /documents/scan: Directory scanning
  • /health: System status
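
For example, /health can serve as a simple liveness probe (the response schema is not specified here, so this only confirms the service is reachable):

curl "http://localhost:9621/health"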

Usage Examples

Query Example

curl -X POST "http://localhost:9621/query" \
    -H "Content-Type: application/json" \
    -d '{"query": "Your question", "mode": "hybrid"}'

Document Upload

curl -X POST "http://localhost:9621/documents/file" \
    -F "file=@document.txt"

Testing

  • Tested with various document sizes
  • Verified streaming functionality
  • Confirmed batch processing capabilities
  • Validated error handling
  • Checked memory management

Documentation

  • Included detailed README for both services
  • Added API documentation
  • Provided configuration guides
  • Included usage examples

Future Improvements

  • Add support for more model providers
  • Implement caching mechanisms
  • Add authentication/authorization
  • Enhance error handling
  • Add monitoring metrics

Dependencies

  • FastAPI
  • Uvicorn
  • LightRAG
  • Pydantic
  • OpenAI/Ollama clients
  • Python 3.8+

This PR significantly enhances LightRAG's usability in distributed environments and makes it easier to integrate with existing applications.

@LarFii
Collaborator

LarFii commented Dec 19, 2024

Thanks for your contribution! But there are some linting errors. Please make sure to run pre-commit run --all-files before submitting to ensure all linting checks pass.

@ParisNeo
Contributor Author

Hi there. Thanks a lot for answering me.
I just ran the linting fix.

Best regards

@ParisNeo
Contributor Author

By the way, why don't you set this up as a GitHub Action so linting is applied to the project automatically?

@LarFii
Collaborator

LarFii commented Dec 20, 2024

Thank you again for your incredible contribution! It seems that automation can't fix all the linting errors, so some manual adjustments are still needed.

@LarFii LarFii merged commit e5dc186 into HKUDS:main Dec 20, 2024
1 check passed
@ParisNeo
Contributor Author

You are welcome.
I even added LightRAG to my tool lollms as a service, so a user can set up a server with LightRAG and then use lollms as a front end to chat with their AI over their vectorized data.

For now there are Ollama and OpenAI backends, but I'll add more services for other backends.

Thanks for accepting my contribution.
Best regards
