A sophisticated, multi-agent Agentic-AI system engineered for comprehensive research, rigorous analysis, and polished response generation for complex scholarly inquiries. This advanced system leverages LangGraph's orchestration capabilities to coordinate specialized AI agents in a meticulously designed sequential workflow, delivering thoroughly researched, authoritative answers with transparent reasoning processes and full visibility into each agent's decision-making path.
The system combines specialized research capabilities across multiple knowledge domains with advanced information synthesis and content refinement, ensuring responses that are not only comprehensive but also well-structured, contextually relevant, and tailored to each unique research query's specific requirements.
- Project Overview
- Features
- Technology Stack
- System Architecture
- File Structure
- Getting Started
- Usage
- How It Works
- Customization
- Future Enhancements
- Acknowledgments
The Deep Research Agentic-AI represents a cutting-edge research assistant meticulously architected using LangChain's modular framework, LangGraph's sophisticated agent orchestration, and LangSmith's comprehensive tracing capabilities. Powered by Groq's high-performance inference engines, this system harnesses multiple specialized large language models to deliver exceptionally thorough, nuanced responses to complex research inquiries across diverse knowledge domains.
The system operates through an intelligently coordinated workflow of purpose-built specialist agents:
-
Researcher Agent - Conducts exhaustive information gathering across authoritative sources (Wikipedia, arXiv, Google Scholar, Tavily) with contextual awareness to construct a comprehensive knowledge foundation tailored to each specific query's requirements and domain context.
-
Drafter Agent - Employs advanced synthesis algorithms to transform raw research into cohesive, structured content through a recursive feedback mechanism that critically evaluates information completeness, identifies knowledge gaps, and adaptively requests targeted additional research when necessary.
-
Enhancer Agent - Applies sophisticated linguistic refinement, factual validation, and structural optimization to transform draft content into publication-quality responses with meticulous attention to accuracy, clarity, coherence, and presentation excellence.
This specialized agent decomposition a significant advancement over monolithic single-agent approachesβenables superior performance through dedicated expertise at each stage of the research process. The architecture provides unparalleled transparency by exposing each agent's reasoning chains, information evaluation criteria, and decision pathways throughout the research lifecycle.
- Multi-Agent Workflow - Coordinated specialists for research, drafting, and enhancement working in concert to produce superior results
- Multiple Research Sources - Integrates knowledge from Wikipedia, Google Scholar, arXiv, and Tavily web search for comprehensive coverage
- Transparent Reasoning - Visible agent logs reveal the system's thinking process, sources consulted, and decision-making rationale
- Interactive Refinement - Generates targeted follow-up questions when more information is needed to address specific knowledge gaps
- State Management - Maintains conversation context with LangGraph's state management for coherent multi-turn interactions
- Streamlit Interface - User-friendly web application with real-time agent activity monitoring showing each step of the research process
- LangSmith Integration - Complete tracing and monitoring capabilities for debugging and performance optimization
-
Core Frameworks:
- LangChain - For agent construction and tool integration, providing a flexible framework for building complex AI systems
- LangGraph - For building the agent workflow and state management, enabling sophisticated agent coordination
- LangSmith - For monitoring the agents, tools and state management with comprehensive tracing capabilities
- Streamlit - For the web user interface with real-time visualization of agent activities
-
Language Models:
- Groq - Primary LLM provider for all agents, selected for its optimal balance of performance and cost
- Models Used:
meta-llama/llama-4-scout-17b-16e-instruct
- Optimized for research tasks with strong reasoning capabilitiesdeepseek-r1-distill-llama-70b
- Selected for content synthesis and drafting with excellent writing abilitiesllama-3.3-70b-versatile
- Chosen for final content enhancement with superior language quality
-
Research Tools:
- Wikipedia API - For general knowledge lookup providing contextual background information
- arXiv API - For scientific paper research accessing the latest academic publications
- Google Scholar - For academic citations and papers with comprehensive scholarly coverage
- Tavily - For web search capabilities delivering current and diverse information sources
-
Dependencies:
langchain
- Core agent and workflow frameworklangchain-groq
- Groq language model integrationlangchain-community
- Community-contributed tools and componentslangchain-core
- Essential LangChain interfaces and classeslangchain-tavily
- Tavily search integrationlangsmith
- Tracing and monitoring frameworklanggraph
- Agent orchestration and state managementtyping-extensions
- Enhanced type annotationspydantic
- Data validation and settings managementarxiv
- arXiv paper search and metadata retrievalgoogle-search-results
- Google Scholar integration via SerpAPIwikipedia
- Wikipedia knowledge accessstreamlit
- Interactive web interface
The system is built on a LangGraph workflow with three main components working together in a coordinated process:
Query β [Researcher] β [Drafter] β (Needs more info? Back to Researcher) β [Enhancer] β Final Answer
- LangGraph State Flow - Coordinates agent activities and communication through a well-defined state machine
- LangChain Tools - Integrates external knowledge sources through structured tools with consistent interfaces
- LangSmith Monitoring - Enables LangSmith tracing to track and monitor the workflow with all integrated tools and states
- Groq AI Models - Powers the system with specialized language models for different tasks, each optimized for its specific role
Deep-Research-Agentic-AI/
β
βββ main.py # π Entry point to run the graph/ CLI inference
βββ app.py # π± Streamlit application
β
βββ state/
β βββ context.py # π§ Shared state (TypedDict)
β
βββ graph/
β βββ graph_builder.py # π LangGraph builder (node flow)
β βββ nodes.py # π§© All LangGraph node functions
β
βββ agents/
β βββ researcher.py # π Research agent (LLM + tools)
β βββ drafter.py # βοΈ Drafting agent (LLM w/feedback loop)
β βββ enhancer.py # π§Ό Final enhancer/validator (LLM)
β
βββ tools/ # π§ Custom tools wrapped for LangChain
β βββ wikipedia.py # π Wikipedia knowledge tool
β βββ arxiv.py # π arXiv research paper tool
β βββ scholar.py # π Google Scholar citation tool
β βββ tavily.py # π Tavily web search tool
β
|
βββ requirements.txt # π¦ All dependencies
βββ .env # π API keys and secrets
βββ .env.example # π Example environment variables template
βββ .gitignore # π« Files to ignore in git
βββ README.md # π Documentation for the project
- VS Code IDE - Recommended for optimal development experience with Python and virtual environment integration
- Python 3.10+ - Required for compatibility with all dependencies
- LangSmith API key - For tracing and monitoring agent workflows
- Groq API key - For accessing the language models
- SERP API key - For Google Scholar integration
- Tavily API key - For web search capabilities
-
Clone the repository:
git clone https://github.com/yourusername/Deep-Research-Agentic-AI.git cd Deep-Research-Agentic-AI
-
Create and activate a virtual environment:
# For Windows python -m venv venv venv\Scripts\activate # For macOS/Linux python3 -m venv venv source venv/bin/activate
-
Install dependencies:
pip install -r requirements.txt
-
Create your environment variables file:
# Copy the example environment file cp .env.example .env # Edit the .env file with your API keys
-
Open the
.env
file and add your API keys here:GROQ_API_KEY="your_groq_api_key" TAVILY_API_KEY="your_tavily_api_key" SERP_API_KEY="your_google_scholar_api_key" LANGSMITH_API_KEY="your_langsmith_project_api_key" LANGSMITH_PROJECT="your_project_name" specify the project name
The system uses different AI models for different agents. You can customize these in the respective agent files:
researcher.py
- Usesmeta-llama/llama-4-scout-17b-16e-instruct
by default for optimal research capabilitiesdrafter.py
- Usesdeepseek-r1-distill-llama-70b
by default for high-quality content draftingenhancer.py
- Usesllama-3.3-70b-versatile
by default for superior content enhancement
Each agent can be configured with different parameters:
- Temperature settings - Adjust creativity vs determinism
- Context window size - Balance between history retention and speed
- Tool selection - Enable or disable specific research tools
- Response formats - Customize output structure for specific needs
Run the system from the command line:
python main.py
You'll be prompted to enter your research query. The system will then:
- Research your query across multiple sources with the Researcher agent
- Draft an initial answer based on research findings with the Drafter agent
- Enhance the draft into a polished, well-structured response with the Enhancer agent
The CLI interface shows real-time updates on each agent's progress and thinking process.
For a more user-friendly experience, launch the Streamlit app:
streamlit run app.py
This provides:
- Chat interface for queries and follow-ups with a clean message history
- Real-time agent logs showing the system's thought process and tool usage
- State visualization to track the flow of information between agents
- Ability to ask follow-up questions that build on previous context
- Options to customize research parameters for specific needs
Example queries to try:
- "What are the latest advances in large language model training techniques?"
- "How does quantum computing differ from classical computing?"
- "Summarize recent research on climate change mitigation strategies"
- Query Processing: Your query is analyzed to determine research requirements and knowledge domains
- Research Phase: The Researcher agent gathers information from Wikipedia, arXiv, Google Scholar, and web search
- Identifies key concepts and relationships in the query
- Selects appropriate research tools based on the query domain
- Retrieves relevant information from multiple sources
- Organizes findings into a structured research summary
- Draft Creation: The Drafter agent analyzes research context and creates a coherent draft
- Synthesizes information from all research sources
- Structures content logically with clear sections
- Identifies potential information gaps
- If information is insufficient, it will request specific follow-up questions
- Enhancement: The Enhancer agent improves the draft for clarity, structure, and accuracy
- Refines language for precision and readability
- Ensures logical flow and coherent structure
- Verifies factual accuracy against research findings
- Adds appropriate formatting and organization
- Final Answer: The system returns a comprehensive, well-formed response with clear attribution to sources
The system maintains state between interactions, allowing for follow-up questions that build on previous context and research history.
Add new tools to the tools/
directory following the pattern of existing tools:
-
Create a new file like
my_tool.py
-
Implement your tool using LangChain's tool framework:
from langchain.tools import BaseToolAPIWrapper from langchain_core.tools import Tool # --- Tool Setup --- # Check documentation of tools to configure the tool base_tool_api = BaseToolAPIWrapper( parameter1=x, parameter2=y, parameter3=z, ) def base_tool_search(query: str) -> str: return base_tool_api.run(query) myNewTool = Tool( # Check documentation of tools to configure the tool name="Base Tool Search", func=base_tool_search, description="Search the results from Base Tool" )
-
Import and add your tool to the list in
researcher.py
:from tools.my_tool import myNewTool # In the tools list tools = [wikipedia, arxiv, scholar, tavily, myNewTool, ] # Add your new tool here
- Citation tracking and verification - Implement a system to track and verify citations for academic credibility
- Academic paper summarization - Add specialized capabilities for condensing complex research papers
- PDF document analysis - Enable direct analysis of uploaded PDF documents
- Local document search - Add capability to search through user-provided document collections
- User preference memory - Store and apply user preferences for research style and detail level
- Multiple LLM provider support - Expand beyond Groq to support other LLM providers
- Cross-language research - Enhance capabilities to research in multiple languages
- Interactive visualization - Add data visualization tools for research findings
- Audio and video source analysis - Expand beyond text to include multi-modal research
- Collaborative research sessions - Enable multiple users to contribute to a shared research project
- Built with:
- Powered by Groq language models for high-performance inference
- Research tools: Wikipedia, arXiv, Google Scholar, and Tavily
- Inspired by latest research in Agentic-AI systems and collaborative intelligence