Convert static infographic images into editable Google Slides. Provide your infographic images, and the system automatically analyzes them using Vision-Language Models (VLMs) to extract text and visual elements, then reconstructs them as native, editable slide elements.
- End-to-End Pipeline: Just provide images - VLM analysis, layout extraction, and slide creation are all automated
- Multi-Provider VLM Support: Google Gemini, OpenAI, Anthropic, and OpenRouter
- Multi-Slide Presentations: Convert multiple infographics into a single presentation
- Flexible Configuration: Configure via .env file, environment variables, or CLI arguments
- Coordinate Transformation: Map pixel coordinates to slide points with aspect-ratio-preserving fit
- Slide Reconstruction: Create editable text boxes and placed images via Google Slides API
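The coordinate transformation mentioned above can be illustrated with a short sketch. This is not the project's actual code: the names `Fit`, `compute_fit`, and `px_box_to_pt` are hypothetical, and the 720 x 405 pt slide size is an assumption based on the usual Google Slides 16:9 page. The idea is to scale by the smaller of the two width and height ratios and center the result, so the image keeps its aspect ratio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fit:
    """Uniform scale plus offsets that center an image on a slide (hypothetical helper)."""
    scale: float
    offset_x: float
    offset_y: float


def compute_fit(image_w: float, image_h: float, slide_w: float, slide_h: float) -> Fit:
    # Use the smaller ratio so the whole image fits without distortion.
    scale = min(slide_w / image_w, slide_h / image_h)
    # Center the scaled image; leftover space becomes equal margins on both sides.
    offset_x = (slide_w - image_w * scale) / 2
    offset_y = (slide_h - image_h * scale) / 2
    return Fit(scale, offset_x, offset_y)


def px_box_to_pt(box, fit: Fit):
    """Map an (x, y, width, height) box in image pixels to slide points."""
    x, y, w, h = box
    return (
        fit.offset_x + x * fit.scale,
        fit.offset_y + y * fit.scale,
        w * fit.scale,
        h * fit.scale,
    )


# Assumed slide size: 720 x 405 pt (16:9).
fit = compute_fit(1600, 900, 720, 405)
print(px_box_to_pt((160, 90, 320, 180), fit))
```

A square image on a 16:9 slide would get a non-zero `offset_x`, which places it in the horizontal center with equal margins on the left and right.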
# 1. Install
cd images2slides
uv sync
# 2. Configure (copy and edit .env.example)
cp .env.example .env
# Edit .env with your API keys and credentials
# 3. Convert images to slides
uv run images2slides convert \
--image slide1.png \
--image slide2.png \
--title "My Presentation"

Note: All CLI commands must be run with the uv run prefix (e.g., uv run images2slides ...), which executes them within the project's virtual environment.
- Python 3.11+
- uv package manager
cd images2slides
uv sync

All configuration can be set via:
- .env file (recommended)
- Environment variables
- CLI arguments (override .env and env vars)
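The precedence described above can be sketched as a small lookup function. This is an illustration only: `resolve_setting` is a hypothetical name, and the ordering between environment variables and .env entries is an assumption (the text only states that CLI arguments override both).

```python
import os


def resolve_setting(name, cli_value=None, dotenv_values=None, environ=None, default=None):
    """Return the first value found: CLI argument, environment variable, .env entry, default."""
    environ = os.environ if environ is None else environ
    dotenv_values = dotenv_values or {}
    # CLI arguments win over everything else (stated in the README).
    if cli_value is not None:
        return cli_value
    # Assumption: a real environment variable beats the same key in .env.
    if name in environ:
        return environ[name]
    return dotenv_values.get(name, default)


provider = resolve_setting(
    "VLM_PROVIDER",
    cli_value=None,
    dotenv_values={"VLM_PROVIDER": "google"},
    environ={},
    default="google",
)
print(provider)
```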
cp .env.example .env

Edit .env and set your VLM provider and API key:
# Choose your provider: google, openai, anthropic, or openrouter
VLM_PROVIDER=google
# Set the API key for your chosen provider
GOOGLE_API_KEY=your-api-key-here

| Provider | Default Model | API Key Variable | Get API Key |
|---|---|---|---|
| google (default) | gemini-3-pro-preview | GOOGLE_API_KEY | Google AI Studio |
| openai | gpt-5.2 | OPENAI_API_KEY | OpenAI Platform |
| anthropic | claude-opus-4-5 | ANTHROPIC_API_KEY | Anthropic Console |
| openrouter | qwen/qwen3-vl-235b-a22b-instruct | OPENROUTER_API_KEY | OpenRouter |
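To make the relationship between provider, default model, and API key variable concrete, here is a minimal sketch that encodes the table above as a lookup. The function name `resolve_vlm_settings` and its error handling are hypothetical; only the provider names, default models, and variable names come from this README.

```python
# Provider -> (default model, API key variable), copied from the table above.
PROVIDER_DEFAULTS = {
    "google": ("gemini-3-pro-preview", "GOOGLE_API_KEY"),
    "openai": ("gpt-5.2", "OPENAI_API_KEY"),
    "anthropic": ("claude-opus-4-5", "ANTHROPIC_API_KEY"),
    "openrouter": ("qwen/qwen3-vl-235b-a22b-instruct", "OPENROUTER_API_KEY"),
}


def resolve_vlm_settings(env):
    """Pick provider, model, and API key from a mapping such as os.environ."""
    provider = env.get("VLM_PROVIDER", "google")
    if provider not in PROVIDER_DEFAULTS:
        raise ValueError(f"Unknown VLM provider: {provider!r}")
    default_model, key_var = PROVIDER_DEFAULTS[provider]
    # VLM_MODEL is optional; fall back to the provider default.
    model = env.get("VLM_MODEL") or default_model
    api_key = env.get(key_var)
    if not api_key:
        raise KeyError(f"{key_var} is not set for provider {provider!r}")
    return provider, model, api_key


print(resolve_vlm_settings({"VLM_PROVIDER": "openai", "OPENAI_API_KEY": "sk-example"}))
```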
To use a specific model, set VLM_MODEL in .env:
VLM_MODEL=gemini-3-pro-preview

You need to authenticate with Google to create presentations. Choose one of these methods:
This method opens a browser window for you to log in with your Google account. Best for local development and personal use.
How to get client_secret.json:
- Go to Google Cloud Console
- Create a new project or select an existing one
- Enable the Google Slides API:
  - Go to "APIs & Services" > "Library"
  - Search for "Google Slides API"
  - Click "Enable"
- Create OAuth credentials:
  - Go to "APIs & Services" > "Credentials"
  - Click "Create Credentials" > "OAuth client ID"
  - If prompted, configure the OAuth consent screen first (External is fine for personal use)
  - Select "Desktop app" as application type
  - Name it (e.g., "Slides Infographic")
  - Click "Create"
- Download the credentials:
  - Click the download button next to your new OAuth client
  - Save the file as secrets/client_secret.json
Configure in .env:
CLIENT_SECRET_PATH=secrets/client_secret.json

On first run, a browser window will open for you to authorize the app. The token is cached for future use.
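For readers curious how a browser-based login with a cached token typically works, the sketch below follows the common pattern from the google-auth-oauthlib library. It is not necessarily how images2slides implements it: the function name `load_credentials` and the cache location `secrets/token.json` are assumptions. The third-party imports sit inside the function so the module can be loaded even when the Google libraries are not installed.

```python
from pathlib import Path

# Scope that allows creating and editing Google Slides presentations.
SCOPES = ["https://www.googleapis.com/auth/presentations"]


def load_credentials(client_secret_path, token_path="secrets/token.json"):
    """Return OAuth credentials, reusing a cached token when possible."""
    # Requires the google-auth and google-auth-oauthlib packages.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow

    token_file = Path(token_path)  # hypothetical cache location
    creds = None
    if token_file.exists():
        creds = Credentials.from_authorized_user_file(str(token_file), SCOPES)
    if creds and creds.valid:
        return creds
    if creds and creds.expired and creds.refresh_token:
        # Refresh silently instead of opening the browser again.
        creds.refresh(Request())
    else:
        # First run: open a browser window for the user to authorize the app.
        flow = InstalledAppFlow.from_client_secrets_file(client_secret_path, SCOPES)
        creds = flow.run_local_server(port=0)
    token_file.parent.mkdir(parents=True, exist_ok=True)
    token_file.write_text(creds.to_json())
    return creds
```

Deleting the cached token file forces a fresh browser login on the next run.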
This method uses a service account key file. Best for automated pipelines and server deployments. No browser interaction needed.
How to get service_account.json:
- Go to Google Cloud Console
- Create a new project or select an existing one
- Enable the Google Slides API (same as above)
- Create a service account:
  - Go to "IAM & Admin" > "Service Accounts"
  - Click "Create Service Account"
  - Name it (e.g., "slides-infographic")
  - Click "Create and Continue"
  - Skip the optional steps, click "Done"
- Create a key:
  - Click on the service account you just created
  - Go to "Keys" tab
  - Click "Add Key" > "Create new key"
  - Select "JSON" format
  - Save the file as secrets/service_account.json
Configure in .env:
SERVICE_ACCOUNT_PATH=secrets/service_account.json

Important: Service accounts create presentations in their own Drive. To access them:
- Share the presentation with your email after creation, OR
- Set up domain-wide delegation to impersonate users
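Sharing a presentation with your own account can be scripted through the Google Drive API. The sketch below is one possible approach, not a feature of images2slides: `share_presentation` is a hypothetical helper, and it assumes `drive_service` was built with `googleapiclient.discovery.build("drive", "v3", credentials=...)` using service account credentials that include a Drive scope, with the Drive API enabled in the project.

```python
def share_presentation(drive_service, presentation_id, email, role="writer"):
    """Grant a user access to a presentation owned by the service account."""
    body = {"type": "user", "role": role, "emailAddress": email}
    return (
        drive_service.permissions()
        .create(
            fileId=presentation_id,
            body=body,
            sendNotificationEmail=False,  # skip the "shared with you" email
            fields="id",
        )
        .execute()
    )
```

After the call succeeds, the presentation should appear under "Shared with me" for the given email address.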
| Use Case | Recommended Method |
|---|---|
| Local development | OAuth 2.0 |
| Personal scripts | OAuth 2.0 |
| CI/CD pipelines | Service Account |
| Server applications | Service Account |
| Shared team tool | Service Account with delegation |
All commands are run with uv run to execute within the virtual environment.
uv run images2slides convert \
--image infographic1.png \
--image infographic2.png \
--title "My Presentation"

This command:
- Analyzes each image with the configured VLM
- Extracts text regions and image regions (icons, logos, charts)
- Optionally uploads cropped image regions to GCS
- Creates a new Google Slides presentation
- Builds editable slides with text boxes and images
If your infographics contain icons, logos, or charts that the VLM detects as image regions, you need a Google Cloud Storage bucket to host them:
uv run images2slides convert \
--image infographic.png \
--title "My Presentation" \
--gcs-bucket your-bucket-name

Or set GCS_BUCKET in your .env file. Without a GCS bucket, image regions will be skipped and you'll see a warning.
Output:
Step 1: Analyzing 1 image(s) with gemini-3-pro-preview...
[1/1] Analyzing: infographic.png
Found 5 text, 3 image regions
Step 2: Uploading 3 image region(s) to GCS...
[1/1] Cropping 3 regions from infographic.png
Step 3: Connecting to Google Slides API...
Step 4: Creating presentation 'My Presentation'...
============================================================
SUCCESS!
============================================================
Presentation URL: https://docs.google.com/presentation/d/abc123/edit
Presentation ID: abc123
Slides created: 1
uv run images2slides convert --help

| Option | Description | Default |
|---|---|---|
| --image | Path to infographic image (repeatable) | Required |
| --title | Presentation title | "Infographic Presentation" |
| --page-size | Slide size: 16:9, 16:10, or 4:3 | 16:9 |
| --provider | VLM provider | From VLM_PROVIDER or "google" |
| --model | VLM model | From VLM_MODEL or provider default |
| --gcs-bucket | GCS bucket for image regions | From GCS_BUCKET |
| --save-layouts | Save layout JSON files to directory | - |
| --client-secret | OAuth client secret path | From CLIENT_SECRET_PATH |
| --service-account | Service account path | From SERVICE_ACCOUNT_PATH |
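As an illustration of what the --page-size values could translate to, the sketch below maps each ratio to a page size in the shape the Google Slides API accepts for `pageSize`. The point dimensions are an assumption based on the standard Google Slides page setups (10 inches wide); the tool itself may use different values, and `page_size_emu` is a hypothetical helper.

```python
EMU_PER_POINT = 12700  # 914400 EMU per inch / 72 points per inch

# Assumed dimensions in points for each supported ratio.
PAGE_SIZES_PT = {
    "16:9": (720.0, 405.0),
    "16:10": (720.0, 450.0),
    "4:3": (720.0, 540.0),
}


def page_size_emu(page_size):
    """Return a Slides API style pageSize dict for a ratio such as '16:9'."""
    try:
        width_pt, height_pt = PAGE_SIZES_PT[page_size]
    except KeyError:
        raise ValueError(f"Unsupported page size: {page_size!r}") from None
    return {
        "width": {"magnitude": round(width_pt * EMU_PER_POINT), "unit": "EMU"},
        "height": {"magnitude": round(height_pt * EMU_PER_POINT), "unit": "EMU"},
    }


print(page_size_emu("4:3"))
```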
Extract layout JSON without creating slides:
uv run images2slides analyze \
--image infographic.png \
--output layouts/

Creates infographic_layout.json with extracted text and bounding boxes.
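The layout file schema is not documented here, so the following sketch uses a hypothetical region shape (`{"text": ..., "bbox": {"x", "y", "w", "h"}}` in pixels) purely to illustrate the kind of cleanup a layout may need before slides are built: collapsing stray whitespace and clamping boxes to the image bounds. The function `clean_region` is an illustration, not part of the package.

```python
def clean_region(region, image_w, image_h):
    """Collapse whitespace in text and clamp the box to the image bounds."""
    box = region["bbox"]  # hypothetical schema: x, y, w, h in pixels
    x, y, w, h = box["x"], box["y"], box["w"], box["h"]
    # Clamp both corners, then derive the new width and height.
    x0 = min(max(x, 0), image_w)
    y0 = min(max(y, 0), image_h)
    x1 = min(max(x + w, 0), image_w)
    y1 = min(max(y + h, 0), image_h)
    cleaned = dict(region, bbox={"x": x0, "y": y0, "w": x1 - x0, "h": y1 - y0})
    if "text" in cleaned:
        # Replace runs of spaces and newlines with single spaces.
        cleaned["text"] = " ".join(cleaned["text"].split())
    return cleaned


region = {"text": "  Hello \n world ", "bbox": {"x": -10, "y": 5, "w": 50, "h": 200}}
print(clean_region(region, image_w=100, image_h=100))
```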
# Validate a layout file
uv run images2slides validate --layout layout.json
# Post-process a layout (clean up whitespace, clamp bounds)
uv run images2slides postprocess --layout raw.json --output clean.json
# Build slides from pre-existing layout files
uv run images2slides create \
--layout slide1.json \
--layout slide2.json \
--title "My Deck"

from images2slides.vlm import VLMConfig, extract_layout_from_image
from images2slides.auth import get_slides_service_oauth
from images2slides.build_slide import build_presentation, SlideInput
from images2slides.postprocess import postprocess_layout

# Configure VLM
config = VLMConfig(provider="google", model="gemini-3-pro-preview")

# Extract layouts from images
layouts = []
for image_path in ["slide1.png", "slide2.png"]:
    layout = extract_layout_from_image(image_path, config)
    layout = postprocess_layout(layout)
    layouts.append(layout)

# Authenticate with Google Slides
service = get_slides_service_oauth("secrets/client_secret.json")

# Build presentation
slide_inputs = [SlideInput(layout=layout) for layout in layouts]
result = build_presentation(
    service=service,
    slides=slide_inputs,
    title="My Presentation",
)
print(f"Created: {result.presentation_url}")

All variables can be set in .env or as environment variables:
| Variable | Description |
|---|---|
| VLM_PROVIDER | VLM provider: google, openai, anthropic, openrouter |
| VLM_MODEL | Model name (optional, uses provider default) |
| GOOGLE_API_KEY | Google AI API key |
| OPENAI_API_KEY | OpenAI API key |
| ANTHROPIC_API_KEY | Anthropic API key |
| OPENROUTER_API_KEY | OpenRouter API key |
| CLIENT_SECRET_PATH | Path to OAuth client secret JSON |
| SERVICE_ACCOUNT_PATH | Path to service account JSON |
| GCS_BUCKET | GCS bucket for image uploads (optional) |
# VLM Configuration
VLM_PROVIDER=google
GOOGLE_API_KEY=AIza...
# Google Slides Authentication
CLIENT_SECRET_PATH=secrets/client_secret.json

uv sync --dev

uv run pytest

uv run ruff check images2slides cli
uv run black images2slides cli

Make sure you have:
- Created a .env file from .env.example
- Set the API key for your chosen provider
- Placed the .env file in the current directory or a parent directory
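If you are unsure which .env file is being picked up, a quick check like the one below can help. It is a standalone sketch that mimics a "current directory, then parents" search; the actual lookup inside images2slides may differ, and `find_dotenv` here is a local helper, not an import from the package.

```python
from pathlib import Path


def find_dotenv(start=None, filename=".env"):
    """Return the first matching file in `start` or any of its parent directories."""
    current = Path(start or Path.cwd()).resolve()
    for directory in (current, *current.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


found = find_dotenv()
print(found if found else "No .env file found in this directory or its parents")
```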
You need Google Slides API credentials. See Configure Google Slides API Access.
If running on a headless server, use a service account instead of OAuth.
Service accounts create files in their own Drive. Either:
- Check the presentation URL in the output and open it
- Share the presentation with your email
- Use OAuth instead for personal use
MIT