A comprehensive GraphRAG (Graph Retrieval-Augmented Generation) system designed for financial research

saakshigupta2002/FinGraphRAG

FinGraphRAG

Portfolio Knowledge Graph & Risk Analysis
Graph-enhanced RAG (GraphRAG) for explainable portfolio insights using Neo4j + Qdrant + FastAPI.


TL;DR

FinGraphRAG is a full‑stack demo that shows how to combine:

  • Vector search (Qdrant) for semantic retrieval over filings/news chunks
  • Knowledge graph reasoning (Neo4j) for relationship-aware expansion (peers, sectors, events)
  • LLM synthesis (OpenAI-compatible API) to generate an answer with structured outputs

It ships with:

  • A FastAPI backend with minimal, clean v1 endpoints
  • A React + Vite frontend to upload a portfolio and run analyses
  • Docker Compose for one-command local setup (Neo4j + Qdrant + backend + frontend)

Demo visuals

✅ Add your project screenshots to docs/images/ (or any folder you prefer) and update the paths below.

[Figure: Portfolio knowledge graph (concept)]

[Figure: Portfolio holdings graph]

[Figure: Full graph snapshot (real run)]


Services

| Service | Purpose | Tech |
|---|---|---|
| Neo4j | Knowledge graph store (companies, sectors, events, filings, portfolios) | Neo4j 5.x |
| Qdrant | Vector store for semantic retrieval over text chunks | Qdrant |
| Backend | API + orchestration pipeline (GraphRAG retrieval + synthesis) | FastAPI + Python |
| Frontend | Portfolio upload + analysis UI | React + Vite |

Data flow (high level)

| Step | What happens |
|---|---|
| 1. Portfolio submitted | User sends tickers + weights to backend (/api/v1/portfolio) |
| 2. Data ingestion | SEC filings + news/event text can be fetched/ingested (extensible pipeline) |
| 3. Indexing | Text is chunked + embedded; embeddings stored in Qdrant |
| 4. Graph build | Entities + relationships stored in Neo4j (holdings, sectors, peers, events, filings) |
| 5. Hybrid retrieval | Vector search finds relevant chunks; graph expansion gathers related context |
| 6. Response | LLM synthesizes a structured answer + citations + chart/table payload |
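Step 5 is the core of the GraphRAG idea, so a toy illustration may help. The sketch below is not the project's retrieval code: it replaces Qdrant with an in-memory list of chunks and Neo4j with a ticker-to-sector dict, and every name in it (`CHUNKS`, `SECTOR_OF`, `hybrid_retrieve`) is hypothetical. It only shows the shape of the technique: rank chunks by vector similarity, then pull in chunks for graph neighbours of the hits.

```python
import math

# Toy stand-ins (assumptions): in the real stack the vectors live in Qdrant
# and the relationships live in Neo4j. The 3-d "embeddings" are made up.
CHUNKS = [
    {"id": "c1", "ticker": "AAPL", "text": "Apple supply chain risk", "vec": [1.0, 0.0, 0.0]},
    {"id": "c2", "ticker": "MSFT", "text": "Microsoft cloud revenue", "vec": [0.0, 1.0, 0.0]},
    {"id": "c3", "ticker": "AMZN", "text": "Amazon logistics costs", "vec": [0.0, 0.0, 1.0]},
]
SECTOR_OF = {
    "AAPL": "Information Technology",
    "MSFT": "Information Technology",
    "AMZN": "Consumer Discretionary",
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(query_vec, top_k):
    """Return the top_k chunks most similar to the query vector."""
    ranked = sorted(CHUNKS, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:top_k]


def sector_peers(tickers):
    """Graph expansion: tickers that share a sector with any seed ticker."""
    sectors = {SECTOR_OF[t] for t in tickers if t in SECTOR_OF}
    return {t for t, s in SECTOR_OF.items() if s in sectors} - set(tickers)


def hybrid_retrieve(query_vec, top_k=1):
    """Vector hits first, then extra chunks reached through the graph."""
    hits = vector_search(query_vec, top_k)
    hit_ids = {c["id"] for c in hits}
    peers = sector_peers({c["ticker"] for c in hits})
    expanded = [c for c in CHUNKS if c["ticker"] in peers and c["id"] not in hit_ids]
    return hits, expanded
```

With a query vector close to the AAPL chunk and `top_k=1`, the vector step returns only `c1`, and the graph step adds `c2` because MSFT shares a sector with AAPL in the toy data. That second chunk is context a pure vector search would have missed.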

Quickstart (Docker)

Prerequisites

  • Docker + Docker Compose
  • An OpenAI API key (optional but recommended for best results)
  • A valid SEC User-Agent (required for SEC endpoints; include contact email)

Run the full stack

# From repo root
cd infra

# Create env file
cp env.example .env

# Edit infra/.env and set at least:
# - OPENAI_API_KEY=...
# - SEC_USER_AGENT=FinGraphRAG/1.0 (youremail@example.com)

docker compose up --build

Open the app

  • Frontend: http://localhost:3000
  • Backend health: http://localhost:8000/api/v1/health
  • Backend Swagger UI (FastAPI): http://localhost:8000/docs

Seed a demo portfolio

# From repo root, in a separate terminal while the stack is running
bash infra/scripts/seed_demo_portfolio.sh

API (v1)

FinGraphRAG intentionally keeps v1 simple: three endpoints, all under /api/v1.

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/health | GET | Health check |
| /api/v1/portfolio | POST | Create/update a portfolio by session_id |
| /api/v1/query | POST | Run an analysis query for a portfolio session |

1) Create / update portfolio

curl -s -X POST http://localhost:8000/api/v1/portfolio \
  -H 'Content-Type: application/json' \
  -d '{
    "session_id": "demo",
    "holdings": [
      {"ticker": "AAPL", "weight": 0.50},
      {"ticker": "MSFT", "weight": 0.30},
      {"ticker": "AMZN", "weight": 0.20}
    ]
  }' | jq
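The same request can be sent from Python using only the standard library. This is a minimal sketch, not an official client: `build_portfolio_payload` and `post_json` are hypothetical helper names, and the check that weights sum to 1 is an assumption based on the example above (the README does not say whether the backend enforces it).

```python
import json
import urllib.request

API_BASE = "http://localhost:8000/api/v1"  # backend address from the Quickstart


def build_portfolio_payload(session_id, weights, tol=1e-6):
    """Build the /portfolio request body from a {ticker: weight} mapping."""
    if not weights:
        raise ValueError("portfolio must contain at least one holding")
    total = sum(weights.values())
    # Assumption: weights are fractions summing to 1, as in the curl example.
    if abs(total - 1.0) > tol:
        raise ValueError(f"weights sum to {total}, expected 1.0")
    return {
        "session_id": session_id,
        "holdings": [{"ticker": t, "weight": w} for t, w in weights.items()],
    }


def post_json(path, payload, timeout=30):
    """POST a JSON body to the backend and return the decoded JSON response."""
    request = urllib.request.Request(
        API_BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the stack running, `post_json("/portfolio", build_portfolio_payload("demo", {"AAPL": 0.5, "MSFT": 0.3, "AMZN": 0.2}))` should be equivalent to the curl call. Validating on the client side surfaces a mistyped weight before any ingestion work starts.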

2) Run a query

curl -s -X POST http://localhost:8000/api/v1/query \
  -H 'Content-Type: application/json' \
  -d '{
    "session_id": "demo",
    "question": "What is my biggest sector exposure and what risk does it imply?",
    "mode": "auto"
  }' | jq

Supported mode values (v1):

| Mode | Intent |
|---|---|
| auto | Let router choose tools/strategy |
| risk | Portfolio risk-style analysis |
| event_impact | News/event impact reasoning |
| company_overview | Single-company summary within portfolio context |
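On the client side, the four mode values can be pinned down in an enum so that a typo fails before the request is sent. This is a sketch under assumptions: `QueryMode` and `build_query_payload` are hypothetical names, and the backend's own schema may differ.

```python
from enum import Enum


class QueryMode(str, Enum):
    """The v1 mode values listed in the table above."""

    AUTO = "auto"
    RISK = "risk"
    EVENT_IMPACT = "event_impact"
    COMPANY_OVERVIEW = "company_overview"


def build_query_payload(session_id, question, mode="auto"):
    """Build the /query request body, rejecting unknown modes and empty questions."""
    if not question.strip():
        raise ValueError("question must not be empty")
    # Enum lookup by value raises ValueError for anything outside the table.
    mode_value = QueryMode(mode).value
    return {"session_id": session_id, "question": question, "mode": mode_value}
```

For example, `build_query_payload("demo", "How exposed am I to tech?", "risk")` yields a body with `"mode": "risk"`, while a misspelled mode such as `"risks"` raises `ValueError` locally.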

Response shape (v1):

| Field | Type | Notes |
|---|---|---|
| summary | string | Final answer |
| citations | list | Sources (SEC filings/news) when available |
| table | object | Tabular payload for UI |
| chart_data | object | Labels + values for charts |
| warnings | list | Any limitations / missing data |
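A client might wrap this shape in a small dataclass so that missing fields do not have to be checked at every call site. The sketch below is an assumption-laden illustration: `QueryResponse` is a hypothetical name, the choice to treat absent or null fields as empty is mine, and the inner keys of `table` and `chart_data` are left untyped because the README does not specify them.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class QueryResponse:
    """Client-side view of the v1 /query response fields."""

    summary: str
    citations: List[Any] = field(default_factory=list)
    table: Dict[str, Any] = field(default_factory=dict)
    chart_data: Dict[str, Any] = field(default_factory=dict)
    warnings: List[Any] = field(default_factory=list)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "QueryResponse":
        # Assumption: absent or null fields are treated as empty values.
        return cls(
            summary=str(data.get("summary") or ""),
            citations=list(data.get("citations") or []),
            table=dict(data.get("table") or {}),
            chart_data=dict(data.get("chart_data") or {}),
            warnings=list(data.get("warnings") or []),
        )
```

`QueryResponse.from_dict(payload)` can then be applied to the decoded JSON body, and the UI can branch on `response.warnings` or `response.citations` without guarding against `None`.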

Local development

Recommended when iterating on backend logic or frontend UI without rebuilding Docker images.

Backend (FastAPI)

cd backend
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run dev server
uvicorn main:app --reload --host 0.0.0.0 --port 8000

Frontend (Vite)

cd frontend
npm install
npm run dev

Update the frontend API base URL in frontend/src/api/client.ts (or via env) if needed.


Configuration

Copy infra/env.example to infra/.env.

| Variable | Required | Example | Why |
|---|---|---|---|
| NEO4J_PASSWORD | | neo4j_pw | Neo4j auth |
| OPENAI_API_KEY | ⛔/✅ | sk-... | Enables LLM routing + synthesis |
| OPENAI_MODEL | | gpt-4o-mini | Model used for synthesis |
| SEC_USER_AGENT | ✅ (if SEC fetch enabled) | FinGraphRAG/1.0 (you@domain.com) | SEC compliance |
| LOOKBACK_DAYS | | 7 | Recency window for events |
| VECTOR_TOP_K | | 5 | Retrieval depth |
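One way a backend could read these variables is sketched below. It is not the project's actual settings module: `Settings` and `load_settings` are hypothetical names, and using the table's example values as fallbacks for OPENAI_MODEL, LOOKBACK_DAYS and VECTOR_TOP_K is an assumption, since the table lists examples rather than documented defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    neo4j_password: Optional[str]
    openai_api_key: Optional[str]
    openai_model: str
    sec_user_agent: Optional[str]
    lookback_days: int
    vector_top_k: int


def _positive_int(env: Mapping[str, str], name: str, fallback: int) -> int:
    """Parse a positive integer variable, using the fallback when it is unset."""
    value = int(env.get(name, fallback))
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {value}")
    return value


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables from the table above out of an environment mapping."""
    return Settings(
        neo4j_password=env.get("NEO4J_PASSWORD"),
        openai_api_key=env.get("OPENAI_API_KEY"),  # optional per the Quickstart
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini"),  # assumed fallback
        sec_user_agent=env.get("SEC_USER_AGENT"),
        lookback_days=_positive_int(env, "LOOKBACK_DAYS", 7),  # assumed fallback
        vector_top_k=_positive_int(env, "VECTOR_TOP_K", 5),  # assumed fallback
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"VECTOR_TOP_K": "10"})`.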

Repo structure

FinGraphRAG/
├── backend/
│   ├── main.py
│   ├── requirements.txt
│   └── src/fin_graphrag/
│       ├── api/                 # FastAPI routes + schemas
│       ├── ingestion/           # chunking, embeddings, SEC/news fetch
│       ├── retrieval/           # vector retrieval + fusion
│       ├── graph/               # Neo4j client, schema, migrations
│       ├── orchestrator/        # router + tool registry + pipeline
│       ├── llm/                 # prompts + LLM client
│       └── domain/              # risk metrics, scoring utilities
├── frontend/
│   └── src/                     # React UI
├── infra/
│   ├── docker-compose.yml
│   ├── env.example
│   ├── neo4j/migrations/
│   └── scripts/
├── docs/
│   ├── api/openapi_v1.yaml
│   └── architecture/
├── Makefile
└── LICENSE

Handy Make targets

make up        # start stack
make down      # stop stack
make build     # rebuild images
make seed-demo # seed demo portfolio
make test      # run pytest

Roadmap

| Version | Focus |
|---|---|
| v1 (current) | Minimal API + GraphRAG skeleton + UI + Docker stack |
| v2 | Stronger ingestion (SEC/news), richer graph schema, improved citations |
| v3 | Risk scoring models, dashboards, alerting, eval harness |

License

MIT — see LICENSE.
