Skip to content

svngoku/Tutorial-Codebase-Knowledge

 
 

Repository files navigation

Turns Codebase into Easy Tutorial with AI

License: MIT

Ever stared at a new codebase written by others feeling completely lost? This tutorial shows you how to build an AI agent that analyzes GitHub repositories and creates beginner-friendly tutorials explaining exactly how the code works.

This is a tutorial project of Pocket Flow, a 100-line LLM framework. It crawls GitHub repositories and build a knowledge base from the code. It analyzes entire codebases to identify core abstractions and how they interact, and transforms complex code into beginner-friendly tutorials with clear visualizations.

⭐ Example Results for Popular GitHub Repositories!

🤯 All these tutorials are generated entirely by AI by crawling the GitHub repo!

  • AutoGen Core - Build AI teams that talk, think, and solve problems together like coworkers!

  • Browser Use - Let AI surf the web for you, clicking buttons and filling forms like a digital assistant!

  • Celery - Supercharge your app with background tasks that run while you sleep!

  • Click - Turn Python functions into slick command-line tools with just a decorator!

  • Crawl4AI - Train your AI to extract exactly what matters from any website!

  • CrewAI - Assemble a dream team of AI specialists to tackle impossible problems!

  • DSPy - Build LLM apps like Lego blocks that optimize themselves!

  • FastAPI - Create APIs at lightning speed with automatic docs that clients will love!

  • Flask - Craft web apps with minimal code that scales from prototype to production!

  • LangGraph - Design AI agents as flowcharts where each step remembers what happened before!

  • LevelDB - Store data at warp speed with Google's engine that powers blockchains!

  • MCP Python SDK - Build powerful apps that communicate through an elegant protocol without sweating the details!

  • NumPy Core - Master the engine behind data science that makes Python as fast as C!

  • OpenManus - Build AI agents with digital brains that think, learn, and use tools just like humans do!

  • Pydantic Core - Validate data at rocket speed with just Python type hints!

  • Requests - Talk to the internet in Python with code so simple it feels like cheating!

  • SmolaAgents - Build tiny AI agents that punch way above their weight class!

🚀 Getting Started

Option 1: Using Docker (Recommended)

  1. Clone this repository

  2. Configure your environment variables in the .env file:

    # Copy the sample .env file
    cp .env.sample .env
    
    # Edit the .env file with your credentials
    # GEMINI_PROJECT_ID=your-project-id
    # GITHUB_TOKEN=your-github-token
  3. Run the application using Docker Compose:

    docker-compose up -d
  4. Access the Streamlit web interface at http://localhost:8501

Option 2: Manual Installation

  1. Clone this repository

  2. Install dependencies:

    pip install -r requirements.txt
  3. Set up LLM in utils/call_llm.py by providing credentials (API key or project name). We highly recommend the latest models with thinking capabilities (Gemini Pro 2.5, Claude 3.7 with thinking, O1). You can verify if it is correctly set up by running:

    python utils/call_llm.py
  4. Run the Streamlit web interface:

    streamlit run app.py

    Or generate a complete codebase tutorial directly using the command line:

    python main.py https://github.com/username/repo --include "*.py" "*.js" --exclude "tests/*" --max-size 50000
    • repo_url - URL of the GitHub repository (required)
    • -n, --name - Project name (optional, derived from URL if omitted)
    • -t, --token - GitHub token (or set GITHUB_TOKEN environment variable)
    • -o, --output - Output directory (default: ./output)
    • -i, --include - Files to include (e.g., ".py" ".js")
    • -e, --exclude - Files to exclude (e.g., "tests/" "docs/")
    • -s, --max-size - Maximum file size in bytes (default: 100KB)

The application will crawl the repository, analyze the codebase structure, generate tutorial content, and save the output in the specified directory (default: ./output).

💡 Development Tutorial

  • I built using Agentic Coding, the fastest development paradigm, where humans simply design and agents code.

  • The secret weapon is Pocket Flow, a 100-line LLM framework that lets Agents (e.g., Cursor AI) build for you

  • Check out the Step-by-step YouTube development tutorial:



About

Turns Codebase into Easy Tutorial with AI

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 98.9%
  • Dockerfile 1.1%