An AI-powered startup analysis platform that automatically researches and evaluates startup ideas using real-time market data, competitor analysis, and social media sentiment. Built with a modern Python stack featuring LangGraph orchestration, Model Context Protocol (MCP) for tool integration, and a beautiful Gradio web interface.
- Multi-Step AI Workflow: Automated 5-step analysis pipeline using LangGraph
- Real-Time Market Research: Web search, competitor analysis, and financial data
- Social Media Intelligence: Reddit and Twitter sentiment analysis
- Structured Output: Pydantic models ensure data consistency and validation
- Multiple Interfaces: Web UI (Gradio) and Command Line Interface
- Report Generation: Export analysis in Text, PDF, and Word formats
- Modular Architecture: MCP-based tool servers for easy extension
The system uses a client-server architecture with MCP (Model Context Protocol):
```
┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│   Main Agent    │      │   MCP Client    │      │  Tool Servers   │
│   (LangGraph)   │◄────►│ (Orchestrator)  │◄────►│ (Data Sources)  │
└─────────────────┘      └─────────────────┘      └─────────────────┘
         │                        │                        │
         ▼                        ▼                        ▼
┌─────────────────┐      ┌─────────────────┐      ┌─────────────────┐
│  Web Interface  │      │  CLI Interface  │      │  External APIs  │
│    (Gradio)     │      │   (Terminal)    │      │  (SERP, etc.)   │
└─────────────────┘      └─────────────────┘      └─────────────────┘
```
- Google Gemini 2.5 Flash: Primary LLM for analysis and reasoning
- SERP API: Web search and news search functionality
- Polygon.io API: Financial market data and company information
- Reddit API (PRAW): Social media sentiment and trend analysis
- Twitter API v2: Social media sentiment and engagement metrics
- Model Context Protocol (MCP): Tool server communication
- LangChain: LLM orchestration and tool integration
- LangGraph: Workflow state management and execution
```
startup-idea-analyzer-agent/
├── src/                         # Core application logic
│   ├── workflow.py              # LangGraph workflow definition
│   ├── models.py                # Pydantic data models
│   └── prompts.py               # LLM prompt templates
├── server/                      # MCP tool servers
│   ├── serp_server.py           # Web search server
│   ├── market_data_server.py    # Financial data server
│   └── social_trends_server.py  # Social media analysis server
├── main.py                      # CLI application entry point
├── gradio_app.py                # Web interface application
├── pyproject.toml               # Project configuration and dependencies
├── uv.lock                      # Dependency lock file
├── .python-version              # Python version specification
├── .gitignore                   # Git ignore rules
└── README.md                    # This file
```
The main analysis pipeline consists of 5 sequential steps (a wiring sketch follows the list):
1. Market Research: Web search for market size, trends, and demographics
2. Competitor Analysis: Identify and analyze existing competitors
3. Social Trends: Analyze Reddit and Twitter sentiment
4. Viability Assessment: Score startup potential (1-10)
5. Final Recommendations: Generate actionable insights
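A minimal sketch of how these five steps might be wired as a LangGraph `StateGraph`. The node bodies here are placeholders; the real node logic and state fields live in `src/workflow.py`:

```python
from typing import List, TypedDict

from langgraph.graph import END, START, StateGraph


class ResearchState(TypedDict, total=False):
    idea: str
    market_research: str
    competitors: List[str]
    social_trends: str
    viability_score: int
    recommendations: str


# Placeholder nodes: each real node calls MCP tools and the LLM,
# then returns a partial state update.
def market_research(state: ResearchState) -> dict:
    return {"market_research": f"market notes for {state['idea']}"}

def competitor_analysis(state: ResearchState) -> dict:
    return {"competitors": ["Competitor A", "Competitor B"]}

def social_trends(state: ResearchState) -> dict:
    return {"social_trends": "sentiment summary"}

def viability_assessment(state: ResearchState) -> dict:
    return {"viability_score": 7}

def final_recommendations(state: ResearchState) -> dict:
    return {"recommendations": "go/no-go reasoning"}


graph = StateGraph(ResearchState)
steps = [
    ("market_research", market_research),
    ("competitor_analysis", competitor_analysis),
    ("social_trends", social_trends),
    ("viability_assessment", viability_assessment),
    ("final_recommendations", final_recommendations),
]
for name, fn in steps:
    graph.add_node(name, fn)

# Chain the five steps sequentially.
graph.add_edge(START, steps[0][0])
for (name, _), (next_name, _) in zip(steps, steps[1:]):
    graph.add_edge(name, next_name)
graph.add_edge(steps[-1][0], END)

app = graph.compile()
result = app.invoke({"idea": "AI-powered meal planning"})
print(result["recommendations"])
```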
Structured Pydantic models for:
- `MarketAnalysis`: Market size, growth, target audience
- `CompetitorInfo`: Competitor details and positioning
- `StartupAnalysis`: Viability scoring and assessment
- `StartupIdea`: Complete startup information
- `ResearchState`: Workflow state management
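For illustration, a simplified sketch of two of these models. The field names here are assumptions; the actual definitions are in `src/models.py`:

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class CompetitorInfo(BaseModel):
    # Hypothetical fields; see src/models.py for the real schema.
    name: str
    description: str
    funding_stage: Optional[str] = None
    pricing_model: Optional[str] = None


class MarketAnalysis(BaseModel):
    market_size: str = Field(description="Estimated total addressable market")
    growth_rate: str
    target_audience: str
    trends: List[str] = Field(default_factory=list)
```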
SERP Server (`server/serp_server.py`)
- Purpose: Web search and news search
- Tools: `search`, `search_news`
- API: SERP API (Google Search)
- Features: Location-based search, result filtering
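Each server exposes its tools over MCP. A minimal sketch of what the SERP server might look like, assuming the `FastMCP` helper from the official `mcp` Python SDK (the actual implementation in `server/serp_server.py` may differ):

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("serp")


@mcp.tool()
def search(query: str, location: str = "United States") -> str:
    """Web search via the SERP API (placeholder body)."""
    # The real tool would call the SERP API with SERP_API_KEY
    # and format the organic results.
    return f"results for {query!r} near {location}"


@mcp.tool()
def search_news(query: str) -> str:
    """News search via the SERP API (placeholder body)."""
    return f"news for {query!r}"


if __name__ == "__main__":
    mcp.run(transport="stdio")
```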
Market Data Server (`server/market_data_server.py`)
- Purpose: Financial market analysis
- Tools: `get_market_size`, `get_growth_trends`, `get_competitor_financials`
- API: Polygon.io (optional)
- Features: Market cap analysis, growth trends, competitor financials
Social Trends Server (`server/social_trends_server.py`)
- Purpose: Social media sentiment analysis
- Tools: `analyze_trends`, `reddit_analysis`, `twitter_sentiment`
- APIs: Reddit API (PRAW), Twitter API v2 (optional)
- Features: Sentiment analysis, engagement metrics, trend identification
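On the client side, the orchestrator spawns these servers and calls their tools over stdio. A sketch using the official MCP Python SDK (connection details in the actual client code may differ):

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

params = StdioServerParameters(command="python", args=["server/serp_server.py"])


async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("available tools:", [t.name for t in tools.tools])
            result = await session.call_tool(
                "search", {"query": "meal kit market size"}
            )
            print(result.content)


asyncio.run(main())
```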
Web Interface (`gradio_app.py`)
- Framework: Gradio
- Features:
  - Real-time progress tracking
  - Dark theme UI
  - Multiple export formats (TXT, PDF, DOCX)
  - Interactive results display
  - Environment validation
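A stripped-down sketch of how the Gradio app might wrap the workflow; the real `gradio_app.py` adds progress tracking, theming, and export handling:

```python
import gradio as gr


def analyze(idea: str) -> str:
    # Placeholder: the real handler runs the LangGraph workflow
    # and formats the resulting report.
    return f"## Analysis for: {idea}\n\n(report body here)"


demo = gr.Interface(
    fn=analyze,
    inputs=gr.Textbox(label="Startup idea", lines=3),
    outputs=gr.Markdown(label="Analysis report"),
    title="Startup Idea Analyzer",
)

if __name__ == "__main__":
    demo.launch(server_port=7860)
```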
CLI Interface (`main.py`)
- Features:
  - Interactive terminal interface
  - Pretty-printed results
  - File export capability
  - Environment validation
  - Error handling and recovery
```bash
# Clone the repository
git clone <repository-url>
cd startup-idea-analyzer-agent

# Install dependencies using uv
uv sync

# Set up environment variables
cp .env.example .env
# Edit .env with your API keys
```

Create a `.env` file with:
```
# Required APIs
GOOGLE_API_KEY=your_gemini_api_key
SERP_API_KEY=your_serp_api_key

# Optional APIs (for enhanced features)
POLYGON_API_KEY=your_polygon_api_key
REDDIT_CLIENT_ID=your_reddit_client_id
REDDIT_CLIENT_SECRET=your_reddit_client_secret
TWITTER_BEARER_TOKEN=your_twitter_bearer_token
```

To launch the web interface:

```bash
uv run python gradio_app.py
```

Then open your browser to http://localhost:7860.
To run the CLI:

```bash
uv run python main.py
```

For Hugging Face Spaces, use `app.py` instead:

```bash
uv run python app.py
```

The system generates comprehensive reports including:
Market analysis:
- Market size and growth projections
- Target audience demographics
- Market trends and barriers to entry
- Regulatory considerations

Competitor analysis:
- Top 5 competitors with detailed profiles
- Business models and funding stages
- Key features and competitive advantages
- Pricing strategies and market positioning

Social trends:
- Reddit discussion analysis and sentiment
- Twitter engagement metrics
- Trending topics and public sentiment
- Community feedback and pain points

Viability assessment:
- Overall viability score (1-10)
- Market opportunity assessment
- Competitive advantages and challenges
- Monetization strategies
- Risk assessment and time to market
- Go/No-Go decision with reasoning

Recommendations:
- Immediate next steps
- Key success factors
- Risk mitigation strategies
- Alternative pivot opportunities
- `GOOGLE_API_KEY`: Required for LLM analysis
- `SERP_API_KEY`: Required for web search
- `POLYGON_API_KEY`: Optional for financial data
- `REDDIT_CLIENT_ID`/`REDDIT_CLIENT_SECRET`: Optional for Reddit analysis
- `TWITTER_BEARER_TOKEN`: Optional for Twitter analysis
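Both interfaces validate these variables at startup (the "Environment validation" feature above). A hypothetical sketch of that check — names like `validate_env` are assumptions, not the actual code:

```python
import os

REQUIRED = ["GOOGLE_API_KEY", "SERP_API_KEY"]
OPTIONAL = [
    "POLYGON_API_KEY",
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "TWITTER_BEARER_TOKEN",
]


def validate_env() -> list[str]:
    """Fail fast on missing required keys; report missing optional ones."""
    missing = [key for key in REQUIRED if not os.getenv(key)]
    if missing:
        raise EnvironmentError(f"Missing required API keys: {', '.join(missing)}")
    # Features backed by these keys degrade gracefully when absent.
    return [key for key in OPTIONAL if not os.getenv(key)]
```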
- Modify prompts in `src/prompts.py`
- Add new tools in the server files
- Extend data models in `src/models.py`
- Customize workflow in `src/workflow.py`
- Update deployment configuration in `app.py`
The system includes robust error handling:
- Graceful fallbacks when APIs are unavailable
- Partial analysis when some tools fail
- Detailed error logging and user feedback
- Automatic retry mechanisms for transient failures (see the sketch below)
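As one illustration, a hypothetical sketch of the retry-with-fallback pattern described above; the actual handling lives in the workflow and server code:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def call_with_retry(tool, arguments, retries=3, delay=1.0, fallback=None):
    """Retry a tool call on transient failures, then fall back to a default."""
    for attempt in range(1, retries + 1):
        try:
            return await tool(**arguments)
        except Exception as exc:  # e.g. rate limits, network hiccups
            logger.warning("attempt %d/%d failed: %s", attempt, retries, exc)
            await asyncio.sleep(delay * attempt)  # linear backoff
    # Return a placeholder so the rest of the analysis can still run.
    return fallback
```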
- Typical Analysis Time: 2-5 minutes per startup idea
- Concurrent Users: Limited by API rate limits
- Data Sources: Real-time web search + cached financial data
- Scalability: MCP architecture allows horizontal scaling
- Python Version: 3.13+ (specified in .python-version)
- LangGraph - Workflow orchestration and state management
- LangChain - LLM integration and tool management
- Model Context Protocol (MCP) - Tool server communication protocol
- Google Gemini 2.5 Flash - Primary LLM for analysis and reasoning
- Pydantic - Data validation and serialization
- Structured Output - LLM response formatting (see the sketch after this list)
- Gradio - Web interface framework
- HTML/CSS - Custom styling and layout
- JavaScript - Interactive UI components
- SERP API - Web search and news search
- Polygon.io API - Financial market data (optional)
- Reddit API (PRAW) - Social media analysis (optional)
- Twitter API v2 (Tweepy) - Social media sentiment (optional)
- uv - Python package manager and project management
- Python 3.13+ - Programming language
- asyncio - Asynchronous programming
- logging - Application logging
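As referenced in the list above, structured output binds the LLM's responses to the Pydantic models. A brief sketch, assuming `langchain-google-genai`; the model identifier and schema here are assumptions:

```python
from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import BaseModel


class MarketSummary(BaseModel):
    # Hypothetical schema; the real models live in src/models.py.
    market_size: str
    growth_rate: str


# Requires GOOGLE_API_KEY in the environment.
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash")
structured_llm = llm.with_structured_output(MarketSummary)
summary = structured_llm.invoke("Summarize the meal-kit market size and growth.")
print(summary.market_size)
```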
1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass
5. Submit a pull request
For issues and questions:
- Check the troubleshooting section
- Review API documentation
- Open an issue on GitHub
- Check environment variable configuration
- Refer to `DEPLOYMENT_GUIDE.md` for deployment help