Soundseeker - AI Experiment by UNIT9
This is an AI experiment originally released at https://www.royalcaribbean.com/soundseeker.
Users could upload three holiday images and receive a bespoke holiday video generated for them, including a soundtrack composed to match their holiday visuals.
This was achieved by developing a custom AI with Google TensorFlow, trained to correlate visual features (patterns, colours, etc.) and semantic image contents (types of objects, people, locations) with musical features (BPM, key and more). It was implemented using three AI networks (sketched in code after the list below):
- A convolutional neural network looking at the image as a whole and learning the mapping to music.
- A deep neural network whose input is features extracted with Google Cloud Vision and represented as word embeddings, learning correlations between the semantic meaning of items in the image and music.
- A merger neural network weighing the output of the other two networks and drawing a final conclusion.
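The pre-trained networks are included with this repo; purely to illustrate the three-branch layout described above, a minimal Keras sketch could look like the following. Every layer size, input shape and output head here (a BPM regression and a 12-class key classifier) is an assumption for illustration, not the actual Soundseeker architecture.

```python
# Illustrative sketch only - the real, pre-trained networks ship with this repo.
from tensorflow.keras import layers, Model

# Branch 1: convolutional network looking at the raw image as a whole.
image_in = layers.Input(shape=(224, 224, 3), name="image")
x = layers.Conv2D(32, 3, activation="relu")(image_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)
visual_out = layers.Dense(64, activation="relu", name="visual_features")(x)

# Branch 2: dense network over word embeddings of Cloud Vision labels
# (assumed here to be averaged into a single 300-dimensional vector).
labels_in = layers.Input(shape=(300,), name="label_embedding")
y = layers.Dense(128, activation="relu")(labels_in)
semantic_out = layers.Dense(64, activation="relu", name="semantic_features")(y)

# Merger: weighs both branches and predicts musical features such as BPM and key.
merged = layers.Concatenate()([visual_out, semantic_out])
z = layers.Dense(64, activation="relu")(merged)
bpm = layers.Dense(1, name="bpm")(z)                          # tempo regression
key = layers.Dense(12, activation="softmax", name="key")(z)   # 12 pitch classes

model = Model(inputs=[image_in, labels_in], outputs=[bpm, key])
model.compile(optimizer="adam",
              loss={"bpm": "mse", "key": "sparse_categorical_crossentropy"})
```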
Training was performed with tastemakers and crowdsourcing, mapping stock images to sample music.
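Continuing the hedged sketch above, crowdsourced image/music pairs could be fed to such a model roughly like this (the arrays are random stand-ins, not the real training data):

```python
import numpy as np

# Stand-ins for a crowdsourced batch: stock images, their label embeddings,
# and the musical features of the sample music chosen to match them.
images = np.random.rand(32, 224, 224, 3)
label_vecs = np.random.rand(32, 300)
bpm_targets = np.random.uniform(60, 180, size=(32, 1))
key_targets = np.random.randint(0, 12, size=(32,))

model.fit({"image": images, "label_embedding": label_vecs},
          {"bpm": bpm_targets, "key": key_targets},
          epochs=3, batch_size=8)
```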
In this experiment, you're given the core of the experience - the pre-trained AI, its source code and helper scripts.
To see a full case study, visit https://www.unit9.com/project/royal-caribbean-sound-seeker.
- Install Python Anaconda
- Create environment
Run conda env create --name soundseeker -f environment.yml
- Activate new environment
Windows: activate soundseeker
Linux/OSX: source activate soundseeker
- Set up Google Cloud Vision key
Windows: set CV_KEY=YOUR_KEY
Linux/OSX: export CV_KEY=YOUR_KEY
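As a reference for what the CV_KEY variable is used for, here is a hedged sketch of fetching image labels from the Google Cloud Vision REST API; the repo's own helper code may do this differently, and get_labels is a hypothetical name.

```python
import base64
import os

import requests


def get_labels(image_path, max_results=10):
    """Return label descriptions for an image via Cloud Vision label detection."""
    key = os.environ["CV_KEY"]  # set as shown above
    with open(image_path, "rb") as f:
        content = base64.b64encode(f.read()).decode("utf-8")
    body = {
        "requests": [{
            "image": {"content": content},
            "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
        }]
    }
    resp = requests.post(
        "https://vision.googleapis.com/v1/images:annotate",
        params={"key": key},
        json=body,
        timeout=30,
    )
    resp.raise_for_status()
    annotations = resp.json()["responses"][0].get("labelAnnotations", [])
    return [a["description"] for a in annotations]
```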
- Run it
python main.py --img-path path_to_image.jpg
- Explore the code