This is a quick proof of concept (POC) that generates a video from still images (AI-generated) and audio (from a text-to-speech API), using Gemini as the language model.
- Install ffmpeg
- Install the Python requirements: `pip install -r requirements.txt`
- Gemini API key
- Access to Google Imagen 2
- Service account with the aiplatform.endpoints.predict permission
- Create a .env file with the following values:
DS_GOOGLE_API_KEY=your-gemini-api-key
DS_PROJECT_ID=your-project-id
DS_LOCATION=us-central1
- Export the GOOGLE_APPLICATION_CREDENTIALS environment variable so that it points to the service account's JSON key file
- Run the app: `streamlit run app.py`
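
The setup steps above can be checked before launching the app. The following is a minimal sketch, not part of the project: the names `parse_env_file`, `preflight`, and `REQUIRED_ENV_VARS` are hypothetical, and it assumes the .env file contains plain `KEY=value` lines as in the example above. The app itself may load its settings differently (for instance through a dotenv library).

```python
import os
import shutil
from pathlib import Path

# Variable names copied from the .env example in this README.
REQUIRED_ENV_VARS = ("DS_GOOGLE_API_KEY", "DS_PROJECT_ID", "DS_LOCATION")


def parse_env_file(path):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(env_path=".env", environ=None):
    """Return a list of setup problems; an empty list means none were found."""
    environ = os.environ if environ is None else environ
    problems = []

    # ffmpeg must be installed and discoverable on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # The .env file must exist and define every required variable.
    env_file = Path(env_path)
    if not env_file.is_file():
        problems.append(f"{env_path} not found")
    else:
        settings = parse_env_file(env_file)
        for name in REQUIRED_ENV_VARS:
            if not settings.get(name):
                problems.append(f"{name} is missing or empty in {env_path}")

    # The service account key must be exported and point to a real file.
    creds = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not creds:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not Path(creds).is_file():
        problems.append(f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {creds}")

    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print(f"[setup] {issue}")
    if not issues:
        print("[setup] all checks passed")
```

Running the script from the project directory prints one line per problem it finds. It only checks that the pieces are present; it does not verify that the API key is valid or that the service account actually holds the required permission.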