MangaFusion 🎌

Transform your ideas into stunning manga pages with AI-powered storytelling and image generation.

Features

🎨 AI Story Planning

10-page outlines with panel hints, prompts, and dialogue
Character design and consistency
Visual style references

🖼️ Real Image Generation

Nano Banana (Gemini 2.5 Flash Image Preview) renders crisp black-and-white manga pages
Character consistency across pages
Style reference support

📖 Audiobook Mode

NEW! ElevenLabs Flash v2.5 TTS integration for dialogue narration
Reader mode with full-screen page viewing
Voice selection from available ElevenLabs voices
Audio generation for each page's dialogue with natural pauses
Usage tracking and character count monitoring
Keyboard navigation (← → arrows, Space/Enter for audio, Esc to exit)

⚡ Instant Publishing

Stream progress live during generation
Studio editor for refining pages with overlays
Real-time collaboration

Setup

Prerequisites

Node.js 18+
PostgreSQL 12+ (optional, for data persistence)
Supabase account (for image storage)
Google AI API key (for Gemini models)
ElevenLabs API key (for audiobook feature)

Installation

Clone and install dependencies:

git clone <your-repo>
cd mangafusion
npm install
(cd backend && npm install)  # subshell, so you stay in the repo root for the next steps

Configure environment variables:

# Copy example files
cp backend/.env.example backend/.env
cp .env.local.example .env.local

# Edit backend/.env with your API keys:
GEMINI_API_KEY=your-gemini-api-key
ELEVENLABS_API_KEY=your-elevenlabs-api-key
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-anon-key
SUPABASE_BUCKET=manga-images

Set up Supabase Storage:
- Create a new bucket named manga-images
- Make it public
- Set allowed MIME types: image/png, image/jpeg, image/webp, audio/mpeg

(Optional) Set up PostgreSQL Database:

# Create database
sudo -u postgres createdb mangafusion

# Add to backend/.env
DATABASE_URL="postgresql://user:password@localhost:5432/mangafusion"

# Run migrations
cd backend
npm run prisma:migrate:deploy
npm run prisma:seed  # Optional: add test data

📖 Full database setup guide: backend/DATABASE_SETUP.md
🔍 Quick reference: backend/PRISMA_QUICK_REFERENCE.md

Note: Without a database, episodes are stored in memory and are lost when the backend restarts.

Start development servers:

# Start both frontend and backend
./dev.sh

# Or manually:
# Terminal 1: Frontend
npm run dev

# Terminal 2: Backend
cd backend && npm run start:dev

API Keys Setup

Google AI (Gemini)

Go to Google AI Studio
Create an API key
Add to backend/.env as GEMINI_API_KEY

ElevenLabs (Audiobook)

Sign up at ElevenLabs
Get your API key from the profile page
Add to backend/.env as ELEVENLABS_API_KEY
Optionally set ELEVENLABS_DEFAULT_VOICE_ID (defaults to Adam voice)
Optionally set ELEVENLABS_MODEL (defaults to eleven_flash_v2_5 for speed and cost efficiency)

Supabase (Storage)

Create a project at Supabase
Go to Settings → API
Copy the project URL and anon key to backend/.env as SUPABASE_URL and SUPABASE_ANON_KEY

Usage

Creating a Manga

Fill out the story form with title, genre, tone, setting, and characters
Optionally upload style reference images
Click "Generate Manga Episode"
Watch as AI plans and generates your 10-page manga

Audiobook Mode

Once pages are generated, click "Reader Mode"
Navigate with arrow keys or buttons
Click "Read Aloud" or press Space/Enter to generate audio for the current page
Audio combines all dialogue and narration for the page

Studio Editor

Click "Edit In Studio" from the episode page
Add text bubbles, images, and overlays
Use AI to regenerate pages with custom prompts
Insert dialogue suggestions from the planner

Architecture

Frontend (Next.js)

/pages/index.tsx - Main creation form
/pages/episodes/[id].tsx - Episode viewer with audiobook mode
/pages/studio/[id].tsx - Studio editor
/components/ - Reusable UI components

Backend (NestJS)

/src/planner/ - AI story planning with Gemini
/src/renderer/ - Image generation with Nano Banana
/src/tts/ - ElevenLabs text-to-speech integration
/src/episodes/ - Episode and page management
/src/storage/ - Supabase file storage
/src/prisma/ - Database ORM and persistence layer

Database (PostgreSQL + Prisma)

Episode: Stores manga episodes with seed input and outlines
Page: Tracks individual pages with status, images, and audio
Character: Maintains character reference images for consistency
In-memory fallback: Runs without a database, but all data is lost on restart
Transaction support: Atomic operations for data integrity
Comprehensive indexes: Optimized for common queries
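
The in-memory fallback described above can be pictured as a small store that is used whenever DATABASE_URL is not set. The sketch below is illustrative only: the `EpisodeRecord` shape, `InMemoryEpisodeStore`, and `persistenceMode` are hypothetical names, not the project's actual classes, and the real backend goes through Prisma when a database is configured.

```typescript
// Hypothetical, simplified episode shape (the real model has more fields).
interface EpisodeRecord {
  id: string;
  title: string;
}

// Minimal store contract that a database-backed store could also satisfy.
interface EpisodeStore {
  save(episode: EpisodeRecord): void;
  get(id: string): EpisodeRecord | undefined;
}

// Keeps episodes in a Map; everything is gone when the process exits.
class InMemoryEpisodeStore implements EpisodeStore {
  private readonly episodes = new Map<string, EpisodeRecord>();

  save(episode: EpisodeRecord): void {
    // Store a copy so later mutations by the caller do not leak in.
    this.episodes.set(episode.id, { ...episode });
  }

  get(id: string): EpisodeRecord | undefined {
    const found = this.episodes.get(id);
    return found ? { ...found } : undefined;
  }
}

// Assumption: persistence is chosen purely by whether DATABASE_URL is set.
function persistenceMode(databaseUrl: string | undefined): "postgres" | "memory" {
  return databaseUrl !== undefined && databaseUrl.trim() !== "" ? "postgres" : "memory";
}
```

Keeping both stores behind one interface is one way to let the rest of the backend stay unaware of which persistence mode is active.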

Environment Variables

Backend (`backend/.env`)

# Required
GEMINI_API_KEY=your-gemini-api-key
ELEVENLABS_API_KEY=your-elevenlabs-api-key
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-anon-key

# Optional - Database (for persistence)
DATABASE_URL=postgresql://user:password@localhost:5432/mangafusion

# Optional - AI Models
PLANNER_MODEL=gemini-2.5-flash
RENDERER_IMAGE_MODEL=gemini-2.5-flash-image-preview

# Optional - TTS
ELEVENLABS_DEFAULT_VOICE_ID=pNInz6obpgDQGcFmaJgB
ELEVENLABS_MODEL=eleven_flash_v2_5

# Optional - Storage & Server
SUPABASE_BUCKET=manga-images
PORT=4000

Frontend (`.env.local`)

NEXT_PUBLIC_API_BASE=http://localhost:4000/api
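
As a rough illustration of how the backend variables above could be validated at startup, the sketch below reads them from a plain object, fails fast on missing required keys, and applies fallbacks for the optional ones. `BackendConfig`, `requireVar`, and `loadBackendConfig` are hypothetical names, and treating the example values listed above as defaults is an assumption (only the voice and TTS model defaults are stated explicitly in this README).

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical config shape; field names are illustrative.
interface BackendConfig {
  geminiApiKey: string;
  elevenLabsApiKey: string;
  supabaseUrl: string;
  supabaseAnonKey: string;
  databaseUrl: string | undefined;
  plannerModel: string;
  rendererImageModel: string;
  defaultVoiceId: string;
  ttsModel: string;
  supabaseBucket: string;
  port: number;
}

// Throws a descriptive error when a required variable is absent or blank.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

function loadBackendConfig(env: Env): BackendConfig {
  return {
    geminiApiKey: requireVar(env, "GEMINI_API_KEY"),
    elevenLabsApiKey: requireVar(env, "ELEVENLABS_API_KEY"),
    supabaseUrl: requireVar(env, "SUPABASE_URL"),
    supabaseAnonKey: requireVar(env, "SUPABASE_ANON_KEY"),
    // Optional: when unset, episodes live in memory only.
    databaseUrl: env["DATABASE_URL"],
    // Assumed defaults, taken from the example values in this README.
    plannerModel: env["PLANNER_MODEL"] ?? "gemini-2.5-flash",
    rendererImageModel: env["RENDERER_IMAGE_MODEL"] ?? "gemini-2.5-flash-image-preview",
    defaultVoiceId: env["ELEVENLABS_DEFAULT_VOICE_ID"] ?? "pNInz6obpgDQGcFmaJgB",
    ttsModel: env["ELEVENLABS_MODEL"] ?? "eleven_flash_v2_5",
    supabaseBucket: env["SUPABASE_BUCKET"] ?? "manga-images",
    port: Number(env["PORT"] ?? "4000"),
  };
}
```

In a Node.js process this would typically be called once as `loadBackendConfig(process.env)`, so a missing key surfaces at startup rather than on the first API call.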

Audiobook Feature Details

The audiobook feature uses ElevenLabs TTS to convert manga dialogue into speech:

Dialogue Generation: The planner creates structured dialogue for each panel
Audio Synthesis: ElevenLabs converts dialogue to natural speech
Storage: Audio files are stored in Supabase alongside images
Playback: Built-in audio player with controls
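
To make the synthesis step more concrete, here is a minimal sketch of the HTTP request a backend might send to the ElevenLabs text-to-speech endpoint. The `TtsRequest` type and `buildTtsRequest` helper are hypothetical and not taken from `/src/tts/`; the default voice ID and model mirror the values listed under Environment Variables.

```typescript
// Hypothetical description of an outgoing TTS request.
interface TtsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request without sending it, which keeps the logic easy to test.
function buildTtsRequest(
  apiKey: string,
  text: string,
  voiceId: string = "pNInz6obpgDQGcFmaJgB", // default voice (Adam)
  modelId: string = "eleven_flash_v2_5",
): TtsRequest {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
      Accept: "audio/mpeg",
    },
    body: JSON.stringify({ text, model_id: modelId }),
  };
}

// Possible usage with the fetch API available in Node.js 18+:
//   const req = buildTtsRequest(apiKey, "Aiko: We have to go.");
//   const res = await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
//   const audio = await res.arrayBuffer(); // MP3 bytes, ready to upload to storage
```

The returned audio is MPEG, which is why `audio/mpeg` appears in the bucket's allowed MIME types.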

Supported Dialogue Types

Dialogue: Character speech with character name prefix
Narration: Story narration without character attribution
Sound Effects: Onomatopoeia and environmental sounds
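
The three dialogue types could be flattened into a single narration script along these lines. This is a sketch under stated assumptions: the `DialogueLine` union and `buildPageScript` are hypothetical names, and separating lines with a blank line is only one simple stand-in for the natural pauses mentioned earlier, not necessarily how the project encodes them.

```typescript
// Hypothetical model of one line of page dialogue.
type DialogueLine =
  | { type: "dialogue"; character: string; text: string }
  | { type: "narration"; text: string }
  | { type: "sfx"; text: string };

// Joins a page's lines into one string suitable for a single TTS call.
function buildPageScript(lines: DialogueLine[]): string {
  return lines
    .map((line) => {
      const text = line.text.trim();
      if (text === "") return "";
      // Character speech gets a name prefix; narration and sound effects do not.
      return line.type === "dialogue" ? `${line.character}: ${text}` : text;
    })
    .filter((part) => part !== "")
    .join("\n\n");
}

// Character count of the script, useful for tracking TTS usage.
function scriptCharacterCount(lines: DialogueLine[]): number {
  return buildPageScript(lines).length;
}
```

Counting characters on the final script rather than on the raw lines means the tally includes the name prefixes and separators that are actually sent for synthesis.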

Reader Mode Controls

← →: Navigate between pages
Space/Enter: Generate and play audio for current page
Esc: Exit reader mode
Audio Player: Standard HTML5 controls for playback
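
The controls above map naturally onto a small key-to-action function. The sketch below is not the project's actual handler; `ReaderAction` and `readerActionForKey` are hypothetical names, and pages are assumed to be zero-indexed. The key strings are the standard `KeyboardEvent.key` values.

```typescript
// Hypothetical set of actions the reader can take in response to a key press.
type ReaderAction =
  | { kind: "goto"; page: number }
  | { kind: "playAudio" }
  | { kind: "exit" }
  | { kind: "none" };

// Pure mapping from a KeyboardEvent.key value to a reader action.
function readerActionForKey(key: string, currentPage: number, pageCount: number): ReaderAction {
  switch (key) {
    case "ArrowLeft":
      // Stay put on the first page.
      return currentPage > 0 ? { kind: "goto", page: currentPage - 1 } : { kind: "none" };
    case "ArrowRight":
      // Stay put on the last page.
      return currentPage < pageCount - 1 ? { kind: "goto", page: currentPage + 1 } : { kind: "none" };
    case " ": // Space
    case "Enter":
      return { kind: "playAudio" };
    case "Escape":
      return { kind: "exit" };
    default:
      return { kind: "none" };
  }
}
```

In a React component this could be called from a `keydown` listener as `readerActionForKey(event.key, page, pages.length)`, with the component deciding how to apply the returned action.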

Contributing

Fork the repository
Create a feature branch
Make your changes
Test thoroughly
Submit a pull request

License

MIT License - see LICENSE file for details.

Support

For issues and questions:

Check the GitHub issues
Review the environment setup
Ensure all API keys are configured correctly
Check Supabase bucket permissions

Built with ❤️ using Next.js, NestJS, Gemini AI, and ElevenLabs TTS.
