Transform your ideas into stunning manga pages with AI-powered storytelling and image generation.
- 10-page outlines with panel hints, prompts, and dialogues
- Character design and consistency
- Visual style references
- Nano Banana (Gemini 2.5 Flash Image Preview) renders crisp B/W manga pages
- Character consistency across pages
- Style reference support
- NEW! ElevenLabs Flash v2.5 TTS integration for dialogue narration
- Reader mode with full-screen page viewing
- Voice selection from available ElevenLabs voices
- Audio generation for each page's dialogue with natural pauses
- Usage tracking and character count monitoring
- Keyboard navigation (← → arrows, Space/Enter for audio, Esc to exit)
- Stream progress live during generation
- Studio editor for refining pages with overlays
- Real-time collaboration
- Node.js 18+
- PostgreSQL 12+ (optional, for data persistence)
- Supabase account (for image storage)
- Google AI API key (for Gemini models)
- ElevenLabs API key (for audiobook feature)
- Clone and install dependencies:
git clone <your-repo>
cd mangafusion
npm install
cd backend && npm install
- Configure environment variables:
# Copy example files
cp backend/.env.example backend/.env
cp .env.local.example .env.local
# Edit backend/.env with your API keys:
GEMINI_API_KEY=your-gemini-api-key
ELEVENLABS_API_KEY=your-elevenlabs-api-key
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-anon-key
SUPABASE_BUCKET=manga-images
- Set up Supabase Storage:
  - Create a new bucket named manga-images
  - Make it public
  - Set allowed MIME types: image/png,image/jpeg,image/webp,audio/mpeg
- (Optional) Set up PostgreSQL Database:
# Create database
sudo -u postgres createdb mangafusion
# Add to backend/.env
DATABASE_URL="postgresql://user:password@localhost:5432/mangafusion"
# Run migrations
cd backend
npm run prisma:migrate:deploy
npm run prisma:seed # Optional: add test data
Full database setup guide: backend/DATABASE_SETUP.md
Quick reference: backend/PRISMA_QUICK_REFERENCE.md
Note: Without a database, episodes are stored in-memory and lost on restart.
- Start development servers:
# Start both frontend and backend
./dev.sh
# Or manually:
# Terminal 1: Frontend
npm run dev
# Terminal 2: Backend
cd backend && npm run start:dev
- Go to Google AI Studio
- Create an API key
- Add to backend/.env as GEMINI_API_KEY
- Sign up at ElevenLabs
- Get your API key from the profile page
- Add to backend/.env as ELEVENLABS_API_KEY
- Optionally set ELEVENLABS_DEFAULT_VOICE_ID (defaults to Adam voice)
- Optionally set ELEVENLABS_MODEL (defaults to eleven_flash_v2_5 for speed and cost efficiency)
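The optional ElevenLabs settings above fall back to defaults when unset. A minimal sketch of how that resolution could look is shown here; `resolveTtsConfig` and `TtsConfig` are hypothetical names for illustration, not the project's actual code, and the default voice ID is the one shown in this README's environment example.

```typescript
// Hypothetical helper (not from the project): resolves TTS settings from an
// env-like object, applying the defaults described in this README.
interface TtsConfig {
  apiKey: string;
  voiceId: string;
  model: string;
}

// Default values taken from the environment example in this README.
const DEFAULT_VOICE_ID = "pNInz6obpgDQGcFmaJgB";
const DEFAULT_TTS_MODEL = "eleven_flash_v2_5";

function resolveTtsConfig(env: Record<string, string | undefined>): TtsConfig {
  const apiKey = env.ELEVENLABS_API_KEY?.trim();
  if (!apiKey) {
    // The API key has no sensible default, so fail early with a clear message.
    throw new Error("ELEVENLABS_API_KEY is required for the audiobook feature");
  }
  return {
    apiKey,
    // Empty or whitespace-only values are treated as unset.
    voiceId: env.ELEVENLABS_DEFAULT_VOICE_ID?.trim() || DEFAULT_VOICE_ID,
    model: env.ELEVENLABS_MODEL?.trim() || DEFAULT_TTS_MODEL,
  };
}
```

In a Node.js backend this would typically be called once at startup as `resolveTtsConfig(process.env)`.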
- Create a project at Supabase
- Go to Settings → API
- Copy URL and anon key to backend/.env
- Fill out the story form with title, genre, tone, setting, and characters
- Optionally upload style reference images
- Click "Generate Manga Episode"
- Watch as AI plans and generates your 10-page manga
- Once pages are generated, click "Reader Mode"
- Navigate with arrow keys or buttons
- Click "Read Aloud" or press Space/Enter to generate audio for the current page
- Audio combines all dialogue and narration for the page
- Click "Edit In Studio" from the episode page
- Add text bubbles, images, and overlays
- Use AI to regenerate pages with custom prompts
- Insert dialogue suggestions from the planner
- /pages/index.tsx - Main creation form
- /pages/episodes/[id].tsx - Episode viewer with audiobook mode
- /pages/studio/[id].tsx - Studio editor
- /components/ - Reusable UI components

- /src/planner/ - AI story planning with Gemini
- /src/renderer/ - Image generation with Nano Banana
- /src/tts/ - ElevenLabs text-to-speech integration
- /src/episodes/ - Episode and page management
- /src/storage/ - Supabase file storage
- /src/prisma/ - Database ORM and persistence layer
- Episode: Stores manga episodes with seed input and outlines
- Page: Tracks individual pages with status, images, and audio
- Character: Maintains character reference images for consistency
- In-memory fallback: Works without database, data lost on restart
- Transaction support: Atomic operations for data integrity
- Comprehensive indexes: Optimized for common queries
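To illustrate the in-memory fallback described above, here is a rough sketch of one way a store could be selected depending on whether DATABASE_URL is set. All names (`EpisodeStore`, `InMemoryEpisodeStore`, `pickEpisodeStore`) and the field names on `Episode` and `Page` are assumptions for illustration; the project's real Prisma schema may differ.

```typescript
// Assumed, simplified shapes; the real models likely carry more fields.
interface Page {
  pageNumber: number;
  status: string; // e.g. a generation status; exact values are not documented here
  imageUrl?: string;
  audioUrl?: string;
}

interface Episode {
  id: string;
  title: string;
  pages: Page[];
}

// Hypothetical storage abstraction shared by the database and fallback paths.
interface EpisodeStore {
  save(episode: Episode): Promise<void>;
  get(id: string): Promise<Episode | undefined>;
}

// Keeps episodes in a Map; everything is lost when the process restarts.
class InMemoryEpisodeStore implements EpisodeStore {
  private readonly episodes = new Map<string, Episode>();

  async save(episode: Episode): Promise<void> {
    this.episodes.set(episode.id, episode);
  }

  async get(id: string): Promise<Episode | undefined> {
    return this.episodes.get(id);
  }
}

// Uses the database-backed store only when DATABASE_URL is configured.
function pickEpisodeStore(
  env: Record<string, string | undefined>,
  createDbStore: (url: string) => EpisodeStore,
): EpisodeStore {
  const url = env.DATABASE_URL?.trim();
  return url ? createDbStore(url) : new InMemoryEpisodeStore();
}
```

Passing the database store as a factory keeps the fallback path free of any database dependency, so the app can start without PostgreSQL installed.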
# Required
GEMINI_API_KEY=your-gemini-api-key
ELEVENLABS_API_KEY=your-elevenlabs-api-key
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-anon-key
# Optional - Database (for persistence)
DATABASE_URL=postgresql://user:password@localhost:5432/mangafusion
# Optional - AI Models
PLANNER_MODEL=gemini-2.5-flash
RENDERER_IMAGE_MODEL=gemini-2.5-flash-image-preview
# Optional - TTS
ELEVENLABS_DEFAULT_VOICE_ID=pNInz6obpgDQGcFmaJgB
ELEVENLABS_MODEL=eleven_flash_v2_5
# Optional - Storage & Server
SUPABASE_BUCKET=manga-images
PORT=4000
NEXT_PUBLIC_API_BASE=http://localhost:4000/api
The audiobook feature uses ElevenLabs TTS to convert manga dialogue into speech:
- Dialogue Generation: The planner creates structured dialogue for each panel
- Audio Synthesis: ElevenLabs converts dialogue to natural speech
- Storage: Audio files are stored in Supabase alongside images
- Playback: Built-in audio player with controls
- Dialogue: Character speech with character name prefix
- Narration: Story narration without character attribution
- Sound Effects: Onomatopoeia and environmental sounds
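As a sketch of how the three dialogue types above might be flattened into a single text for speech synthesis, consider the following. The `DialogueLine` shape and `buildNarrationScript` are hypothetical; the planner's actual output format is not documented here, and joining entries with blank lines is only an assumed way to separate them.

```typescript
// Assumed representation of the three dialogue types listed above.
type DialogueLine =
  | { type: "dialogue"; character: string; text: string }
  | { type: "narration"; text: string }
  | { type: "sfx"; text: string };

interface NarrationScript {
  script: string;
  characterCount: number; // useful for usage tracking against TTS quotas
}

function buildNarrationScript(lines: DialogueLine[]): NarrationScript {
  const parts: string[] = [];
  for (const line of lines) {
    const text = line.text.trim();
    if (text.length === 0) continue; // skip empty entries entirely
    if (line.type === "dialogue") {
      // Character speech carries the character name as a prefix.
      parts.push(`${line.character}: ${text}`);
    } else {
      // Narration and sound effects have no character attribution.
      parts.push(text);
    }
  }
  const script = parts.join("\n\n");
  return { script, characterCount: script.length };
}
```

The resulting `script` is what would be sent to the TTS service for a page, and `characterCount` could feed the character count monitoring mentioned in the feature list.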
- ← →: Navigate between pages
- Space/Enter: Generate and play audio for current page
- Esc: Exit reader mode
- Audio Player: Standard HTML5 controls for playback
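The keyboard controls above can be modelled as a small mapping from key names to reader actions. This is an illustrative sketch only: `actionForKey` and `applyNavigation` are hypothetical names, and mapping the left arrow to the previous page is an assumption (a right-to-left reading order could reverse it). The key strings are standard `KeyboardEvent.key` values.

```typescript
type ReaderAction = "previousPage" | "nextPage" | "playAudio" | "exitReader";

// Maps a KeyboardEvent.key value to a reader action, or null if unhandled.
function actionForKey(key: string): ReaderAction | null {
  switch (key) {
    case "ArrowLeft":
      return "previousPage"; // assumption: left means previous
    case "ArrowRight":
      return "nextPage";
    case " ": // KeyboardEvent.key for the space bar is a single space
    case "Enter":
      return "playAudio";
    case "Escape":
      return "exitReader";
    default:
      return null;
  }
}

// Computes the new page index, clamped to the valid range [0, pageCount - 1].
function applyNavigation(currentIndex: number, action: ReaderAction, pageCount: number): number {
  if (pageCount <= 0) return 0;
  if (action === "previousPage") return Math.max(0, currentIndex - 1);
  if (action === "nextPage") return Math.min(pageCount - 1, currentIndex + 1);
  return currentIndex; // non-navigation actions leave the page unchanged
}
```

In a React component this could be wired up with a `keydown` listener that calls `actionForKey(event.key)` and dispatches on the result.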
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
MIT License - see LICENSE file for details.
For issues and questions:
- Check the GitHub issues
- Review the environment setup
- Ensure all API keys are configured correctly
- Check Supabase bucket permissions
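When checking the environment setup, a quick script can report which required variables are missing. The sketch below is a hypothetical helper, not part of the project; the key list mirrors the variables marked as required in this README.

```typescript
// Variables this README marks as required.
const REQUIRED_ENV_KEYS = [
  "GEMINI_API_KEY",
  "ELEVENLABS_API_KEY",
  "SUPABASE_URL",
  "SUPABASE_ANON_KEY",
] as const;

// Returns the names of required variables that are unset or blank.
function missingEnvKeys(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim().length === 0;
  });
}
```

Running `missingEnvKeys(process.env)` at backend startup and logging the result makes a misconfigured key visible before the first generation request fails.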
Built with ❤️ using Next.js, NestJS, Gemini AI, and ElevenLabs TTS.