Text to Neo4j Knowledge Graph Builder

A Python pipeline that transforms unstructured text into a queryable knowledge graph, using an LLM served by Ollama for extraction and Neo4j for storage, with an interactive exploration interface.

Features

📄 Batch processes multiple .txt files from an Input folder
🧠 Uses an LLM served by Ollama (default: llama3.1:8B) to extract entities and relationships
🗃️ Automated Neo4j graph construction with MERGE operations
🔄 Text chunking via LangChain (512-character chunks with a 50-character overlap)
🏷️ Automatic type normalization (e.g., "Chief Officer" → "ChiefOfficer")
🔗 Relationship standardization (UPPER_SNAKE_CASE)
🔍 Built-in NLP explorer with semantic search capabilities
📊 Interactive graph visualization and statistics
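
The two normalization features above can be illustrated with a minimal sketch. The function names `normalize_type` and `normalize_relationship` are hypothetical and the actual rules in `Text_to_Neo4J.py` may differ; this version only splits on non-alphanumeric characters.

```python
import re


def normalize_type(raw: str) -> str:
    """Join the words of an entity type into one label: 'Chief Officer' -> 'ChiefOfficer'."""
    words = re.findall(r"[A-Za-z0-9]+", raw)
    # Capitalize only the first character so acronyms such as 'CEO' are preserved.
    return "".join(word[0].upper() + word[1:] for word in words)


def normalize_relationship(raw: str) -> str:
    """Convert a relationship name to UPPER_SNAKE_CASE: 'founded by' -> 'FOUNDED_BY'."""
    words = re.findall(r"[A-Za-z0-9]+", raw)
    return "_".join(word.upper() for word in words)


if __name__ == "__main__":
    print(normalize_type("Chief Officer"))       # ChiefOfficer
    print(normalize_relationship("founded by"))  # FOUNDED_BY
```

Normalizing before writing to Neo4j keeps "Chief Officer" and "chief officer" from becoming two separate labels.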

Project Structure

Text-to-Neo4j-Knowledge-Graph-Builder/
├── Text_to_Neo4J.py               # Main ETL pipeline
├── Neo4j_Data_Retrival_NLP.py      # Interactive explorer
├── Input/                          # Source text files
│   ├── sample1.txt
│   └── sample2.txt
├── requirements.txt                # Dependency spec
├── LICENSE                         # MIT License
└── README.md                       # This document

Prerequisites

Component	Version	Installation Guide
Python	3.8+	python.org
Neo4j	4.4+	neo4j.com/download
Ollama	Latest	ollama.ai
llama3.1:8B	-	`ollama pull llama3.1:8B`

Installation

Clone repository:

git clone https://github.com/Mrigank005/Text-to-Neo4j-Knowledge-Graph-Builder.git
cd Text-to-Neo4j-Knowledge-Graph-Builder

Set up environment:

pip install -r requirements.txt
python -m spacy download en_core_web_sm

Configure Neo4j:
- Start Neo4j Desktop/Server
- Set password in both scripts:
```
NEO4J_PASSWORD = "your_neo4j_password"  # In both .py files
```

Prepare Ollama:

ollama serve &  # Run in background
ollama pull llama3.1:8B

Workflow Overview

flowchart LR
    A[Input Texts] --> B(Text Chunking)
    B --> C{LLM Processing}
    C --> D[Entities]
    C --> E[Relationships]
    D --> F[(Neo4j Graph)]
    E --> F
    F --> G[[Explorer UI]]
    
    subgraph Extraction
    B -->|LangChain| C
    C -->|llama3.1:8B| D
    C -->|llama3.1:8B| E
    end
    
    subgraph Storage
    D -->|MERGE| F
    E -->|CREATE| F
    end
    
    subgraph Visualization
    F -->|Query| G
    end
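
The LLM Processing step has to turn the model's free-form reply into entities and relationships before anything is written to Neo4j. The sketch below shows one way to do that, assuming the model is prompted to return JSON with `entities` and `relationships` keys. That schema and the name `parse_extraction` are assumptions for illustration, not the format used by the script.

```python
import json
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def _list_of_dicts(value: Any) -> List[Record]:
    """Return the dict items of value if it is a list, otherwise an empty list."""
    if not isinstance(value, list):
        return []
    return [item for item in value if isinstance(item, dict)]


def parse_extraction(raw: str) -> Tuple[List[Record], List[Record]]:
    """Parse an LLM reply into (entities, relationships); return empty lists on failure."""
    # Models often wrap JSON in extra prose, so keep only the outermost {...} span.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        return [], []
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return [], []

    # Keep only entities that carry both a string id and a string type.
    entities = [
        e for e in _list_of_dicts(data.get("entities"))
        if isinstance(e.get("id"), str) and isinstance(e.get("type"), str)
    ]
    known_ids = {e["id"] for e in entities}

    # Drop relationships whose endpoints were not extracted as entities.
    relationships = [
        r for r in _list_of_dicts(data.get("relationships"))
        if isinstance(r.get("type"), str)
        and isinstance(r.get("source"), str) and r["source"] in known_ids
        and isinstance(r.get("target"), str) and r["target"] in known_ids
    ]
    return entities, relationships
```

Returning empty lists instead of raising lets a batch run skip a malformed chunk and continue with the next one.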

Usage

1. Data Ingestion

python Text_to_Neo4J.py

- Processes all .txt files in Input/
- Shows real-time progress for each chunk
- Outputs summary statistics
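
As a rough illustration of the first step, the sketch below collects every `.txt` file from the input folder. The helper name `read_input_files` and the UTF-8 encoding are assumptions; the script's own loading code may differ.

```python
from pathlib import Path
from typing import Dict


def read_input_files(input_dir: str = "Input") -> Dict[str, str]:
    """Return a mapping of file name to file content for every .txt file in input_dir."""
    folder = Path(input_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"Input directory not found: {folder}")
    # Sort the paths so files are processed in a stable, repeatable order.
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(folder.glob("*.txt"))
    }
```

Files with other extensions are ignored, which matches the "No files processed" hint in the troubleshooting guide.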

2. Graph Exploration

python Neo4j_Data_Retrival_NLP.py

Menu Options:

📊 Graph summary statistics
🔍 Node search (exact/NLP)
🕵️ Node detail inspection
🛣️ Pathfinding between nodes
🔄 Duplicate relationship check
🧠 Semantic search

Example Use Case

Input Text:

Apple Inc. was founded by Steve Jobs in 1976. The company develops consumer electronics like the iPhone.

Resulting Graph:

(:Company {id: "Apple Inc.", name: "Apple Inc."})-[:FOUNDED_BY]->(:Person {id: "Steve Jobs"})
(:Company {id: "Apple Inc."})-[:DEVELOPS]->(:Product {id: "iPhone"})
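
A graph like the one above could be written with a query built as in the sketch below. Labels and relationship types cannot be supplied as ordinary Cypher query parameters, so they are validated and interpolated into the query text, while the node ids stay as parameters. The helper name `build_merge_query` is hypothetical, and this version uses `MERGE` for the relationship as well so that re-running the ingestion does not add a second copy.

```python
import re

# Only plain identifiers are allowed, because these values become part of the query text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_merge_query(source_type: str, rel_type: str, target_type: str) -> str:
    """Build a Cypher statement that merges two nodes and the relationship between them."""
    for name in (source_type, rel_type, target_type):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"Unsafe label or relationship type: {name!r}")
    return (
        f"MERGE (s:{source_type} {{id: $source_id}}) "
        f"MERGE (t:{target_type} {{id: $target_id}}) "
        f"MERGE (s)-[:{rel_type}]->(t)"
    )


if __name__ == "__main__":
    print(build_merge_query("Company", "FOUNDED_BY", "Person"))
```

With the official `neo4j` Python driver, the resulting string can be passed to `session.run(query, source_id="Apple Inc.", target_id="Steve Jobs")`.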

Configuration Reference

Parameter	Default Value	Description
`NEO4J_URI`	`bolt://localhost:7687`	Neo4j connection endpoint
`OLLAMA_MODEL`	`"llama3.1:8B"`	LLM model for extraction
`chunk_size`	`512`	Character count per text segment
`chunk_overlap`	`50`	Overlap between segments
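
To show how `chunk_size` and `chunk_overlap` interact, here is a simplified fixed-window sketch. It is not LangChain's splitter, which also tries to break on separators such as paragraphs and sentences. The function name `chunk_text` is hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # distance between the starts of consecutive chunks
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks
```

With the defaults, each new chunk starts 462 characters after the previous one, so a 1000-character text yields three chunks.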

Troubleshooting Guide

Issue	Solution Steps
Neo4j connection failed	1. Verify service is running 2. Check bolt:// URL 3. Confirm credentials
No files processed	1. Ensure `Input/` directory exists 2. Verify .txt extension
LLM timeout errors	1. Check `ollama serve` status 2. Reduce chunk_size 3. Try simpler model
Duplicate nodes/relationships	1. Pre-clear graph if needed 2. Check MERGE logic
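
For the connection checks in the table above, a quick way to tell whether a service is listening at all is a plain TCP probe. The sketch below uses only the standard library. The helper name `endpoint_is_reachable` is hypothetical, and it confirms only that the port accepts connections, not that credentials or the model are correct.

```python
import socket
from urllib.parse import urlsplit


def endpoint_is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host and port in url succeeds."""
    parts = urlsplit(url)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"URL must include a host and a port: {url!r}")
    try:
        with socket.create_connection((parts.hostname, parts.port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Neo4j endpoint from the configuration reference.
    print("Neo4j:", endpoint_is_reachable("bolt://localhost:7687"))
    # Assumes Ollama is listening on its default port, 11434.
    print("Ollama:", endpoint_is_reachable("http://localhost:11434"))
```

If the probe returns False, start the service before checking credentials or model names.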

Performance Tips

For large datasets:
- Increase `chunk_size` to 768-1024
- Pull a larger model (`ollama pull llama3:70B`) for complex texts
- Process files sequentially with pauses

For better accuracy:
- Pre-process texts to remove noise
- Add domain-specific examples in prompts
- Use a smaller `chunk_overlap` (20-30)

License

MIT License - See LICENSE for full text.
