literAI

Demo: https://literai.hooloovoo.ai (source)

literAI is an experiment in open source AI composition written by emozilla. Originally inspired by scribepod by yacine, it creates a podcast in which two hosts, Alice and Bob, analyze a novel they both purportedly read recently, accompanied by images generated from inferred descriptions of scenes in the novel. Crucially, literAI uses exclusively open source AI models (no API calls) and is designed to run on (admittedly high-end) consumer-grade hardware. It requires 24 GB of VRAM, although it could likely be tweaked to work with less.

Models used

| Model | Purpose |
| --- | --- |
| pszemraj/long-t5-tglobal-xl-16384-book-summary | Generate summaries of the novel text |
| allenai/cosmo-xl | Conversation generation |
| google/flan-t5-xl | Scene description summarization from novel passages |
| dreamlike-art/dreamlike-diffusion-1.0 | Image generation |

Packages/tools used

| Package | Purpose |
| --- | --- |
| transformers | Run LLMs |
| diffusers | Run diffusion models |
| textsum | Automate summary batching |
| LangChain | LLM context and prompt construction |
| TorToiSe | Audio generation |
| pydub | Audio stitching |

Running

To run, clone the repository and install the necessary requirements.

git clone https://github.com/jquesnelle/literAI
cd literAI
python -m pip install -r ./requirements.txt

Then, pass the novel's title, author, and path to the raw UTF-8 encoded text file to the literai module.

python -m literai "Alice's Adventures in Wonderland" "Lewis Carroll" alice-in-wonderland.txt
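
Since the input must be a raw UTF-8 encoded text file, it can help to check the encoding before starting a long run. The following is a minimal sketch, not part of literAI: the helper names `is_utf8_text` and `convert_to_utf8` are hypothetical, and the `latin-1` default is only an assumption about what a mis-encoded file might use.

```python
from pathlib import Path


def is_utf8_text(path: str) -> bool:
    """Return True if the file at `path` decodes cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(src: str, dst: str, source_encoding: str = "latin-1") -> None:
    """Re-encode `src` as UTF-8 at `dst`.

    `source_encoding` is an assumption supplied by the caller; this helper
    does not detect the original encoding.
    """
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")
```

For example, `is_utf8_text("alice-in-wonderland.txt")` returning `False` would suggest re-encoding the file before passing it to the `literai` module.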

Note: this may take a while. A 24 GB CUDA-capable video card is highly recommended. The generated data will be in the output/ folder.
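
To check ahead of time whether a machine meets the 24 GB recommendation, one option is to query `nvidia-smi`. This is a rough sketch, not part of literAI: the function names are hypothetical, and the threshold treats 24 GB as decimal gigabytes (about 22888 MiB), which is an assumption made because cards sold as 24 GB may report slightly less than 24576 MiB.

```python
import shutil
import subprocess

# Assumption: "24 GB" is read as 24 * 10^9 bytes, expressed in MiB.
REQUIRED_MIB = 24 * 1000**3 // 2**20


def parse_total_memory_mib(smi_output: str) -> list[int]:
    """Parse one integer (MiB) per GPU, skipping lines that are not plain numbers."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip().isdigit()]


def query_gpu_memory_mib() -> list[int]:
    """Return total memory in MiB for each GPU, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_total_memory_mib(result.stdout)


def has_recommended_vram(memory_mib: list[int], required_mib: int = REQUIRED_MIB) -> bool:
    """Return True if at least one GPU meets the threshold."""
    return any(m >= required_mib for m in memory_mib)


if __name__ == "__main__":
    gpus = query_gpu_memory_mib()
    print("GPU memory (MiB):", gpus)
    print("Meets 24 GB recommendation:", has_recommended_vram(gpus))
```

This only reports total memory per card; it says nothing about how much is free when literAI starts.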

Running incrementally

Generating a literAI podcast is done in six steps, which the main literai command combines together. The steps are:

  1. Generate summaries
  2. Generate dialogue script
  3. Generate image descriptions
  4. Generate images
  5. Generate audio
  6. (optional) Add to index file and upload to Google Cloud Storage

Each of these steps can be invoked separately. For example, to re-create the dialogue script (the output is random each time):

python -m literai.steps.step2 "Alice's Adventures in Wonderland" "Lewis Carroll"
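
Re-running a step from a script can be sketched as below. Only the `literai.steps.step2` invocation above comes from this README; that the other steps are named `literai.steps.step1` through `literai.steps.step6` and accept the same two arguments is an assumption, so check each step module before relying on it. The helper names `step_command` and `run_step` are hypothetical.

```python
import subprocess
import sys


def step_command(step: int, title: str, author: str) -> list[str]:
    """Build the command line for one step.

    Assumption: every step follows the `literai.steps.stepN` naming shown
    for step 2 and takes the title and author as its arguments.
    """
    if not 1 <= step <= 6:
        raise ValueError("literAI has six steps, numbered 1 to 6")
    return [sys.executable, "-m", f"literai.steps.step{step}", title, author]


def run_step(step: int, title: str, author: str) -> None:
    """Run one step in a subprocess and raise if it exits with an error."""
    subprocess.run(step_command(step, title, author), check=True)


if __name__ == "__main__":
    # Re-create the dialogue script, as in the example above.
    run_step(2, "Alice's Adventures in Wonderland", "Lewis Carroll")
```

Passing arguments as a list avoids shell quoting problems with titles that contain apostrophes or spaces.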