Zotero Database Analyzer for Fast Literature Review Composition
A Python package for analyzing Zotero databases and generating structured literature reviews for LLM agents.
- Fetch literature items from personal or group Zotero libraries
- Advanced filtering by tags, collections, authors, keywords, date ranges, and item types
- Full metadata extraction including abstracts, DOIs, BibTeX citations
- Search functionality across your entire library
- Automatic categorization based on user-defined keywords
- Support for multiple classification schemes
- Smart content analysis for grouping related papers
- JSON format for structured data processing
- Markdown format optimized for LLM consumption
- Specialized context files for literature review composition
- Support for both individual items and categorized collections
- Model Context Protocol (MCP) interface for seamless agent integration
- Tools for fetching, categorizing, and exporting literature data
- Designed for use with Claude, GPT-4, and other LLM agents
- Perfect for automated literature review generation
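The keyword-based categorization listed above can be pictured with a small standalone sketch. It is not the package's implementation: the `Category` class and `categorize` function are hypothetical names, and matching by case-insensitive substring in the title and abstract is an assumption about how such a scheme might work.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """Hypothetical stand-in for a user-defined category (not the package's class)."""
    name: str
    keywords: list[str] = field(default_factory=list)


def categorize(papers: list[dict], categories: list[Category]) -> dict[str, list[str]]:
    """Group paper titles under every category whose keywords appear in the text."""
    grouped: dict[str, list[str]] = {c.name: [] for c in categories}
    grouped["Uncategorized"] = []
    for paper in papers:
        # Search title and abstract together, ignoring case.
        text = f"{paper.get('title', '')} {paper.get('abstract', '')}".lower()
        matched = [c.name for c in categories
                   if any(k.lower() in text for k in c.keywords)]
        # A paper with no keyword hit falls back to the catch-all bucket.
        for name in matched or ["Uncategorized"]:
            grouped[name].append(paper.get("title", ""))
    return grouped
```

A paper can land in more than one category under this scheme, which suits overlapping research themes; a stricter scheme would keep only the best match.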
pip install zoterodb-analyzer
git clone https://github.com/MasterYip/ZoteroDB-Analyzer.git
cd ZoteroDB-Analyzer
pip install -e .
pip install "zoterodb-analyzer[mcp]"
Set up Zotero credentials:
# Linux Bash
export ZOTERO_API_KEY="your_api_key"
export ZOTERO_LIBRARY_ID="your_user_id"
# Windows Cmd
set ZOTERO_API_KEY=your_api_key
set ZOTERO_LIBRARY_ID=your_user_id
# Windows PowerShell
$env:ZOTERO_LIBRARY_ID='your_user_id'
$env:ZOTERO_API_KEY='your_api_key'
Run the examples:
python examples/basic_usage.py
Try the CLI:
zoterodb-analyzer --help
First, get your Zotero API credentials:
- your_api_key: Go to Zotero Settings to create a new private key with library access.
- your_user_id: Go to your user profile; the URL is https://www.zotero.org/<your_user_name>/<your_user_id>.
Windows Command Prompt:
set ZOTERO_LIBRARY_ID=your_user_id
set ZOTERO_API_KEY=your_api_key
Windows PowerShell:
$env:ZOTERO_LIBRARY_ID='your_user_id'
$env:ZOTERO_API_KEY='your_api_key'
To set the variables permanently on Windows:
- Press Win+R, type sysdm.cpl, press Enter
- Go to Advanced > Environment Variables
- Add ZOTERO_LIBRARY_ID and ZOTERO_API_KEY as new variables
Linux Bash:
export ZOTERO_LIBRARY_ID='your_user_id'
export ZOTERO_API_KEY='your_api_key'
To make it permanent, add to ~/.bashrc or ~/.zshrc:
echo 'export ZOTERO_LIBRARY_ID="your_user_id"' >> ~/.bashrc
echo 'export ZOTERO_API_KEY="your_api_key"' >> ~/.bashrc
from zoterodb_analyzer import ZoteroAnalyzer, ContentExporter, FilterCriteria, LiteratureCategory
# ItemType and ExportFormat are used below; importing them from the package
# top level is an assumption, so adjust the path if they live in a submodule
from zoterodb_analyzer import ItemType, ExportFormat
# Initialize analyzer
analyzer = ZoteroAnalyzer(
    library_id="your_user_id",
    library_type="user",  # or "group"
    api_key="your_api_key"
)
# Fetch items with filtering
filter_criteria = FilterCriteria(
    tags=["machine learning", "robotics"],
    date_range=(2020, 2024),
    item_types=[ItemType.JOURNAL_ARTICLE]
)
items = analyzer.fetch_items(filter_criteria, limit=50)
print(f"Found {len(items)} items")
# Export for LLM consumption
exporter = ContentExporter("output")
exported_files = exporter.export_items(items, format=ExportFormat.MARKDOWN)
print(f"Exported to: {exported_files['markdown']}")
# Define literature categories
categories = [
    LiteratureCategory(
        name="Diffusion Models",
        description="Papers on diffusion models in robotics",
        keywords=["diffusion", "denoising", "generative model"]
    ),
    LiteratureCategory(
        name="Reinforcement Learning",
        description="RL approaches for robot control",
        keywords=["reinforcement learning", "policy gradient", "Q-learning"]
    )
]
# Categorize items
categorized_items = analyzer.categorize_items(items, categories)
# Export categorized literature for LLM context
exported_files = exporter.export_categorized_items(
    categorized_items,
    format=ExportFormat.BOTH
)
# Create LLM-optimized context file
llm_context = exporter.export_for_llm_context(
    categorized_items,
    context_type="related_works"
)
The package includes a powerful CLI for easy automation:
# Fetch and export literature
zoterodb-analyzer fetch \
    --library-id YOUR_USER_ID \
    --api-key YOUR_API_KEY \
    --tags "machine learning,robotics" \
    --year-range 2020-2024 \
    --format both \
    --categories-file categories.json
# List available collections
zoterodb-analyzer collections --library-id YOUR_USER_ID --api-key YOUR_API_KEY
# Search your library
zoterodb-analyzer search \
    --library-id YOUR_USER_ID \
    --api-key YOUR_API_KEY \
    --query "deep learning" \
    --limit 20
Create a categories.json file to define your literature categories:
[
  {
    "name": "Diffusion Models",
    "description": "Papers on diffusion models and generative approaches",
    "keywords": ["diffusion", "denoising", "DDPM", "score-based"]
  },
  {
    "name": "Robot Learning",
    "description": "Learning approaches for robotics",
    "keywords": ["robot learning", "imitation learning", "demonstration"]
  }
]
The package provides a Model Context Protocol server for seamless integration with LLM agents:
from zoterodb_analyzer.mcp_server import ZoteroMCPServer
# Initialize MCP server
mcp_server = ZoteroMCPServer(
    default_library_id="your_user_id",
    default_api_key="your_api_key"
)
# Available tools for agents:
# - fetch_literature: Get literature with filtering
# - categorize_literature: Categorize and export literature
# - search_literature: Search library contents
# - get_collections: List available collections
# - get_tags: Get library tags
# - export_for_llm: Create LLM-optimized exports
To integrate ZoteroDB Analyzer with VS Code Copilot, follow these steps:
First, ensure the package is installed:
# Install the package
pip install -e .
Then open your VS Code settings and add this MCP server configuration:
{
  "mcp": {
    "servers": {
      "MCP_ZoteroDB": {
        "type": "stdio",
        "command": "python",
        "args": [
          "E:\\<path-to-this-pkg>\\mcp_server_runner.py"
        ],
        "env": {
          "ZOTERO_LIBRARY_ID": "your_user_id",
          "ZOTERO_API_KEY": "your_api_key"
        }
      }
    }
  }
}
- A proxy may interfere with data access; avoid routing this MCP server through one.
- Replace your_user_id and your_api_key with your actual Zotero credentials
- Use double backslashes \\ for Windows paths in JSON configuration
- Keep your API key secure and consider using environment variables instead of hardcoding
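The double-backslash rule is easy to get wrong by hand. One way around it, sketched below, is to let Python's `json` module do the escaping. The `build_mcp_config` helper and the install path are hypothetical, and the fragment omits the `env` block, so credentials would come from system environment variables.

```python
import json


def build_mcp_config(runner_path: str) -> str:
    """Return the server entry from the configuration above as JSON text."""
    config = {
        "mcp": {
            "servers": {
                "MCP_ZoteroDB": {
                    "type": "stdio",
                    "command": "python",
                    "args": [runner_path],
                }
            }
        }
    }
    # json.dumps escapes each backslash in the path as a double backslash.
    return json.dumps(config, indent=2)


# Hypothetical install location; the raw string keeps single backslashes in Python.
print(build_mcp_config(r"C:\tools\zoterodb\mcp_server_runner.py"))
```

The printed text can be pasted into the settings file, and parsing it back with `json.loads` returns the original single-backslash path.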
For better security, you can configure the MCP server to use system environment variables:
{
  "mcp": {
    "servers": {
      "MCP_ZoteroDB": {
        "type": "stdio",
        "command": "python",
        "args": [
          "E:\\<path-to-this-pkg>\\mcp_server_runner.py"
        ]
      }
    }
  }
}
Then set your credentials as system environment variables (as described in the Environment Variables section above).
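When credentials live in the environment, a script has to read them back and should fail clearly if one is missing. The sketch below shows one way to do that. `load_zotero_credentials` is a hypothetical helper, not part of the package, and defaulting `ZOTERO_LIBRARY_TYPE` to `"user"` is an assumption.

```python
import os
from typing import Mapping, Optional


def load_zotero_credentials(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read Zotero settings from environment variables, failing early if any are missing."""
    env = os.environ if env is None else env
    required = ("ZOTERO_LIBRARY_ID", "ZOTERO_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "library_id": env["ZOTERO_LIBRARY_ID"],
        "api_key": env["ZOTERO_API_KEY"],
        # Assumption: an unset ZOTERO_LIBRARY_TYPE means a personal library.
        "library_type": env.get("ZOTERO_LIBRARY_TYPE", "user"),
    }
```

The returned keys mirror the keyword arguments passed to `ZoteroAnalyzer` in the usage example, so the dictionary could plausibly be unpacked into that constructor.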
Once configured, restart VS Code Copilot. You can then use the following MCP tools in your conversations:
- fetch_literature - Search and retrieve papers from your Zotero library
- categorize_literature - Automatically categorize papers for literature reviews
- search_literature - Search your library with text queries
- get_collections - List your Zotero collections
- get_tags - Get all tags from your library
- export_for_llm - Export literature in LLM-optimized formats
After configuration, you can ask Copilot things like:
- "Search my Zotero library for papers about diffusion models"
- "Categorize my recent machine learning papers for a literature review"
- "Find papers by [author name] in my library"
- "Export papers about robotics in markdown format for my thesis"
The MCP server will automatically handle the requests and provide structured literature data that Copilot can use to help with your research and writing tasks.
You can test the MCP server functionality before integrating with Copilot:
# Test the MCP server directly
python test_mcp_client.py
# Run the MCP server manually
python mcp_server_runner.py
Set environment variables for easier usage:
export ZOTERO_API_KEY="your_api_key"
export ZOTERO_LIBRARY_ID="your_user_id"
export ZOTERO_LIBRARY_TYPE="user" # or "group"
- Automatically categorize papers by research themes
- Generate structured content for Related Works sections
- Extract key metadata and abstracts for analysis
- Provide structured literature context to LLM agents
- Enable agents to query and analyze your research library
- Automate literature review generation
- Analyze research trends across time periods
- Identify key authors and publication venues
- Track citation patterns and relationships
- ZoteroAnalyzer: Main class for fetching and analyzing Zotero data
- ContentExporter: Handles exporting to various formats
- FilterCriteria: Defines filtering parameters for literature search
- LiteratureCategory: Represents a category for organizing literature
- ZoteroItem: Represents a single literature item with metadata
- fetch_items(): Retrieve items with optional filtering
- categorize_items(): Organize items into predefined categories
- search_items(): Search library using text queries
- export_items(): Export items in JSON/Markdown formats
- export_for_llm_context(): Create LLM-optimized context files
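Category definitions stored in a file such as categories.json can be checked before they are turned into `LiteratureCategory` objects. The sketch below is a standalone validator, not package code: `load_categories` is a hypothetical name, and treating `name`, `description`, and `keywords` as all required is an assumption based on the example file.

```python
import json


def load_categories(text: str) -> list[dict]:
    """Parse categories JSON and check each entry has the fields used in categories.json."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("categories file must contain a JSON array")
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"category {index} must be a JSON object")
        for key in ("name", "description", "keywords"):
            if key not in entry:
                raise ValueError(f"category {index} is missing '{key}'")
        keywords = entry["keywords"]
        if not isinstance(keywords, list) or not all(isinstance(k, str) for k in keywords):
            raise ValueError(f"category {index} needs a list of string keywords")
    return entries
```

Each validated entry carries the same three fields that `LiteratureCategory` receives in the usage example, so building the objects from the entries should be a short step.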
We welcome contributions! Please see our Contributing Guidelines for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
If you use ZoteroDB-Analyzer in your research, please cite:
@software{zoterodb_analyzer,
title={ZoteroDB-Analyzer: A Python Package for Literature Review Automation},
author={Raymon Yip},
year={2024},
url={https://github.com/MasterYip/ZoteroDB-Analyzer}
}
- Documentation: [Link to docs]
- Bug Reports: GitHub Issues
- Discussions: GitHub Discussions
- Contact: contact@zoterodb-analyzer.com
- Web interface for non-technical users
- Integration with additional reference managers
- Advanced citation network analysis
- Automated literature trend detection
- Support for full-text analysis
