Copy code from your codebase to clipboard instantly for LLM context!

seatedro/glimpse

Glimpse

A fast tool for peeking at codebases. It loads your codebase into an LLM's context and includes built-in token counting and code analysis.

Features

  • Fast parallel file processing
  • Tree-view of codebase structure
  • Source code content viewing
  • Token counting with multiple backends (tiktoken, HuggingFace)
  • Call graph generation for code analysis
  • Configurable defaults with global and per-repo config
  • Clipboard support
  • Customizable file type detection
  • Respects .gitignore automatically
  • Web content processing with Markdown conversion
  • Git repository support (GitHub, GitLab, Bitbucket, Azure DevOps)
  • URL traversal with configurable depth
  • XML output format for better LLM compatibility
  • Interactive file picker
  • PDF export

Installation

Using cargo:

cargo install glimpse

Using homebrew:

brew tap seatedro/glimpse
brew install glimpse

Using Nix:

# Install directly
nix profile install github:seatedro/glimpse

# Or use in your flake
{
  inputs.glimpse.url = "github:seatedro/glimpse";
}

Using an AUR helper:

# Using yay
yay -S glimpse

# Using paru
paru -S glimpse

Usage

Basic Usage

# Process a local directory
glimpse /path/to/project

# Process multiple files
glimpse file1 file2 file3

# Process a Git repository
glimpse https://github.com/username/repo.git

# Process a web page and convert to Markdown
glimpse https://example.com/docs

# Process a web page and its linked pages
glimpse https://example.com/docs --traverse-links --link-depth 2

On first use in a repository, Glimpse will save a .glimpse configuration file locally with your specified options. This file can be referenced on subsequent runs, or overridden by passing options again.

Common Options

# Show hidden files
glimpse -H /path/to/project

# Only show tree structure
glimpse -o tree /path/to/project

# Save output to GLIMPSE.md (default if no path given)
glimpse -f /path/to/project

# Save output to a specific file
glimpse -f output.txt /path/to/project

# Print output to stdout instead of copying to clipboard
glimpse -p /path/to/project

# Include specific file types (additive to source files)
glimpse -i "*.rs,*.go" /path/to/project

# Only include specific patterns (replaces default source detection)
glimpse --only-include "*.rs,*.go" /path/to/project

# Exclude patterns or files
glimpse -e "target/*,dist/*" /path/to/project

# Count tokens using tiktoken (OpenAI's tokenizer, the default backend)
glimpse /path/to/project

# Use HuggingFace tokenizer with specific model
glimpse --tokenizer huggingface --model gpt2 /path/to/project

# Use custom local tokenizer file
glimpse --tokenizer huggingface --tokenizer-file /path/to/tokenizer.json /path/to/project

# Process a Git repository and save as PDF
glimpse https://github.com/username/repo.git --pdf output.pdf

# Open interactive file picker
glimpse --interactive /path/to/project

# Output in XML format for better LLM compatibility
glimpse -x /path/to/project

# Print the config file path and exit
glimpse --config_path

# Initialize a .glimpse config file in the current directory
glimpse --config
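
The difference between -i (additive) and --only-include (replacing) can be easier to see in code. The following is a minimal Python sketch of that selection logic, not Glimpse's actual implementation. SOURCE_EXTENSIONS and is_selected are hypothetical names, the extension set is a stand-in for the real source detection, and letting exclude patterns win over include patterns is an assumption of this sketch.

```python
from fnmatch import fnmatch

# Hypothetical stand-in for Glimpse's built-in source file detection.
SOURCE_EXTENSIONS = {".rs", ".go", ".py"}


def is_selected(path, include=(), only_include=(), exclude=()):
    """Decide whether a relative path would be part of the output."""
    # Assumption: an exclude match always wins.
    if any(fnmatch(path, pattern) for pattern in exclude):
        return False
    # --only-include replaces source detection entirely.
    if only_include:
        return any(fnmatch(path, pattern) for pattern in only_include)
    # -i is additive: detected source files plus anything matching a pattern.
    is_source = any(path.endswith(ext) for ext in SOURCE_EXTENSIONS)
    return is_source or any(fnmatch(path, pattern) for pattern in include)
```

With this sketch, is_selected("README.md", include=["*.md"]) is selected in addition to the detected source files, while is_selected("src/main.rs", only_include=["*.md"]) is not, because source detection is bypassed.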

Code Analysis

Glimpse includes code analysis features for understanding call relationships in your codebase.

Call Graph Generation

Generate call graphs to see what functions a target function calls (callees) or what calls it (callers):

# Generate call graph for a function (searches all files)
glimpse code :function_name

# Specify file and function
glimpse code src/main.rs:main

# Include callers (reverse call graph)
glimpse code src/main.rs:main --callers

# Limit traversal depth
glimpse code :process --depth 3

# Output to file
glimpse code :build -f callgraph.md

# Strict mode: only resolve via imports (no global name matching)
glimpse code :main --strict

# Precise mode: use LSP for type-aware resolution (slower but accurate)
glimpse code :main --precise

# Specify project root
glimpse code :main --root /path/to/project
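
The target argument in the commands above takes either a file:function or a :function form. As an illustration only, here is a small Python sketch of how such a target could be split. Target and parse_target are hypothetical names, and splitting on the last colon is an assumption of this sketch, not documented Glimpse behavior.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    file: Optional[str]  # None means "search all files"
    function: str


def parse_target(target: str) -> Target:
    """Split a 'file:function' or ':function' string into its parts."""
    # rpartition splits on the LAST colon, so any earlier colon stays in the path.
    file_part, sep, function = target.rpartition(":")
    if not sep or not function:
        raise ValueError(f"expected file:function or :function, got {target!r}")
    return Target(file_part or None, function)
```

For example, parse_target(":main") yields a target with no file, and parse_target("src/main.rs:main") yields the file src/main.rs with the function main.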

Code Index Management

Glimpse maintains an index for faster code analysis. Manage it with:

# Build or update the index
glimpse index build

# Build with LSP for precise resolution
glimpse index build --precise

# Force rebuild (ignore existing index)
glimpse index build --force

# Clear the index
glimpse index clear

# Show index status and stats
glimpse index status

# Specify project path
glimpse index build /path/to/project

Runtime Dependencies

The code analysis features (glimpse code, glimpse index) require additional tools to be installed:

Tree-sitter Grammars

Glimpse automatically downloads and compiles tree-sitter grammars on first use. This requires:

  • git - to clone grammar repositories
  • C compiler (cc) - to compile parser.c
  • C++ compiler (c++) - to compile scanner.cc (some grammars)

On most systems these are available via:

  • macOS: xcode-select --install
  • Ubuntu/Debian: sudo apt install build-essential git
  • Fedora: sudo dnf install gcc gcc-c++ git
  • Arch: sudo pacman -S base-devel git

LSP Auto-Installation

When using --precise mode, Glimpse uses Language Server Protocol (LSP) servers for accurate type-aware resolution. Glimpse looks for each server in the following order and attempts to auto-install it if none is found:

  1. System PATH - Use existing LSP if already installed
  2. Cached binary - Use previously downloaded/installed LSP
  3. URL download - Download pre-built binaries (e.g., lua-language-server, rust-analyzer)
  4. Package managers - Install via npm/bun, go, or cargo if configured

For LSPs that don't provide pre-built binaries, auto-install requires the respective toolchain: cargo, npm or bun, or go.

If auto-install fails, you'll see: LSP server '<name>' not found. Install it manually.
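
The lookup order described in this section can be sketched as a simple fallback chain. This is a Python illustration under stated assumptions, not Glimpse's code: resolve_server, cache_dir, and the installers callbacks are hypothetical, and the cache layout (one file per server name) is invented for the example.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional, Sequence

# Hypothetical installer signature: (server name, cache dir) -> binary path or None.
Installer = Callable[[str, Path], Optional[Path]]


def resolve_server(name: str, cache_dir: Path, installers: Sequence[Installer] = ()) -> Path:
    """Return a path to the LSP binary, trying each source in priority order."""
    # 1. System PATH: use an existing installation if there is one.
    found = shutil.which(name)
    if found:
        return Path(found)
    # 2. Cached binary from an earlier download or install.
    cached = cache_dir / name
    if cached.is_file():
        return cached
    # 3 and 4. Installers in order: URL download first, then package managers.
    for install in installers:
        installed = install(name, cache_dir)
        if installed is not None:
            return installed
    raise FileNotFoundError(f"LSP server '{name}' not found. Install it manually.")
```

Passing the installers as an ordered list keeps the priority explicit: the first source that produces a binary wins, and the error is raised only after every option has been tried.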

CLI Reference

Usage: glimpse [OPTIONS] [PATH]
       glimpse code [OPTIONS] <TARGET>
       glimpse index <COMMAND>

Arguments:
  [PATH]  Files, directories, or URLs to analyze [default: .]

Options:
      --config_path                Print the config file path and exit
      --config                     Init glimpse config file in current directory
      --interactive                Opens interactive file picker (? for help)
  -i, --include <PATTERNS>         Additional patterns to include (e.g. "*.rs,*.go")
      --only-include <PATTERNS>    Only include these patterns (replaces source detection)
  -e, --exclude <PATTERNS|PATHS>   Additional patterns or files to exclude
  -s, --max-size <BYTES>           Maximum file size in bytes
      --max-depth <DEPTH>          Maximum directory depth to traverse
  -o, --output <FORMAT>            Output format: tree, files, or both
  -f, --file [<PATH>]              Save output to specified file (default: GLIMPSE.md)
  -p, --print                      Print to stdout instead of copying to clipboard
  -t, --threads <COUNT>            Number of threads for parallel processing
  -H, --hidden                     Show hidden files and directories
      --no-ignore                  Don't respect .gitignore files
      --no-tokens                  Disable token counting
      --tokenizer <TYPE>           Tokenizer to use: tiktoken or huggingface
      --model <NAME>               Model name for HuggingFace tokenizer
      --tokenizer-file <PATH>      Path to local tokenizer file
      --traverse-links             Traverse links when processing URLs
      --link-depth <DEPTH>         Maximum depth to traverse links (default: 1)
      --pdf <PATH>                 Save output as PDF
  -x, --xml                        Output in XML format for better LLM compatibility
  -v, --verbose                    Verbosity level (-v, -vv, -vvv)
  -h, --help                       Print help
  -V, --version                    Print version

Code Subcommand:
  glimpse code <TARGET>            Generate call graph for a function
    <TARGET>                       Target in file:function or :function format
    --root <PATH>                  Project root directory [default: .]
    --callers                      Include callers (reverse call graph)
    --depth <N>                    Maximum depth to traverse
    -f, --file <PATH>              Output file (default: stdout)
    --strict                       Only resolve calls via imports
    --precise                      Use LSP for type-aware resolution

Index Subcommand:
  glimpse index build [PATH]       Build or update the index
    --force                        Force rebuild
    --precise                      Use LSP for precise resolution
  glimpse index clear [PATH]       Clear the index
  glimpse index status [PATH]      Show index status and stats

Configuration

Glimpse uses a global config file located at:

  • Linux/macOS: ~/.config/glimpse/config.toml
  • Windows: %APPDATA%\glimpse\config.toml

Example configuration:

# General settings
max_size = 10485760  # 10MB
max_depth = 20
default_output_format = "both"

# Token counting settings
default_tokenizer = "tiktoken"       # Can be "tiktoken" or "huggingface"
default_tokenizer_model = "gpt2"     # Default model for HuggingFace tokenizer

# URL processing settings
traverse_links = false               # Whether to traverse links by default
default_link_depth = 1               # Default depth for link traversal

# Default exclude patterns
default_excludes = [
    "**/.git/**",
    "**/target/**",
    "**/node_modules/**"
]

XML Output Format

Glimpse supports an XML output format designed for better compatibility with Large Language Models. With the -x or --xml flag, the output is wrapped in XML tags that mark the boundaries and structure of your codebase.

XML Structure

<context name="my_project">
<tree>
└── src/
  └── main.rs
</tree>

<files>
<file path="src/main.rs">
================================================
fn main() {
    println!("Hello, World!");
}
</file>
</files>

<summary>
Total files: 1
Total size: 45 bytes
</summary>
</context>

Benefits for LLM Usage

  • Clear context boundaries with the <context> wrapper
  • Structured sections for directory tree, file contents, and summary
  • Proper XML escaping
  • Automatic project name detection
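
As a rough illustration of the structure and escaping described above, the following Python sketch builds a document of the same shape. render_context is a hypothetical function, not part of Glimpse, and it omits the separator line and token counts of the real output.

```python
from xml.sax.saxutils import escape, quoteattr


def render_context(name: str, tree: str, files: dict[str, str]) -> str:
    """Render a <context> document with a tree, file contents, and a summary."""
    parts = [f"<context name={quoteattr(name)}>", "<tree>", escape(tree), "</tree>", "", "<files>"]
    for path, content in files.items():
        # quoteattr escapes and quotes attribute values; escape handles &, <, >.
        parts.append(f"<file path={quoteattr(path)}>")
        parts.append(escape(content))
        parts.append("</file>")
    parts.append("</files>")
    total_bytes = sum(len(content.encode("utf-8")) for content in files.values())
    parts += [
        "",
        "<summary>",
        f"Total files: {len(files)}",
        f"Total size: {total_bytes} bytes",
        "</summary>",
        "</context>",
    ]
    return "\n".join(parts)
```

Escaping matters because source code routinely contains characters such as < and &; without it, a file containing "a < b && c" would break the surrounding tags.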

Token Counting

Glimpse supports two tokenizer backends:

  1. Tiktoken (Default): OpenAI's tokenizer implementation, suited to estimating token counts for GPT models.

  2. HuggingFace Tokenizers: Supports any model from the HuggingFace hub or a local tokenizer file, which is useful for custom models or other ML frameworks.

The token count appears in both file content views and the final summary, helping you estimate context window usage for large language models.

Git Repository Support

Glimpse can directly process Git repositories from:

  • GitHub
  • GitLab
  • Bitbucket
  • Azure DevOps
  • Any Git repository URL (ending with .git)

The repository is cloned to a temporary directory, processed, and automatically cleaned up.
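
The clone, process, and clean up cycle can be sketched in a few lines of Python. This is an illustration, not Glimpse's implementation: git_clone and with_cloned_repo are hypothetical names, and the shallow clone (--depth 1) is an assumption made to keep the example quick.

```python
import subprocess
import tempfile
from pathlib import Path


def git_clone(url: str, dest: Path) -> None:
    # Assumption: a shallow clone is enough for reading file contents.
    subprocess.run(["git", "clone", "--depth", "1", url, str(dest)], check=True)


def with_cloned_repo(url, process, clone=git_clone):
    """Clone url into a temporary directory, run process on it, then clean up."""
    with tempfile.TemporaryDirectory(prefix="glimpse-") as tmp:
        dest = Path(tmp) / "repo"
        clone(url, dest)
        # The directory is removed when the with block exits, even on error.
        return process(dest)
```

The clone step is passed in as a parameter so the cleanup behavior can be exercised without network access, for example with a function that just writes a few files into dest.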

Web Content Processing

Glimpse can process web pages and convert them to Markdown:

  • Preserves heading structure
  • Converts links (both relative and absolute)
  • Handles code blocks and quotes
  • Supports nested lists
  • Processes images and tables

With link traversal enabled, Glimpse can also process linked pages up to a specified depth, which suits documentation sites and wikis.
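
Depth-limited link traversal amounts to a breadth-first walk that stops expanding pages once the depth limit is reached. The Python sketch below runs over an in-memory link map instead of real HTTP requests. traverse and links are hypothetical, and it assumes that a depth of 1 means the start page plus the pages it links to directly.

```python
from collections import deque


def traverse(start: str, links: dict[str, list[str]], link_depth: int = 1) -> list[str]:
    """Return pages reachable from start within link_depth hops, in visit order."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= link_depth:
            continue  # do not follow links from pages at the depth limit
        for linked in links.get(url, []):
            if linked not in seen:  # skip pages already queued, which also breaks cycles
                seen.add(linked)
                order.append(linked)
                queue.append((linked, depth + 1))
    return order
```

Tracking visited pages is what keeps the walk from looping on sites where pages link back to each other, which is common in documentation and wikis.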

PDF Output

Any processed content (local files, Git repositories, or web pages) can be saved as a PDF with:

  • Preserved formatting
  • Syntax highlighting
  • Table of contents
  • Page numbers

Troubleshooting

  1. File too large: Increase max_size in the config, or pass -s/--max-size
  2. Missing files: Check the -H/--hidden flag, your exclude patterns, and .gitignore (use --no-ignore to bypass it)
  3. Performance issues: Try adjusting the thread count with -t
  4. Tokenizer errors:
    • For HuggingFace models, make sure you have an internet connection so the tokenizer can be downloaded
    • For local tokenizer files, verify the file path and format
    • Fall back to the default tiktoken backend if issues persist

License

MIT
