This repository is a base for a project on Mixture-of-Experts Multimodal Variational Autoencoders (MMVAE).
The objective is to engineer an AI architecture for diagonal integration of multiple modalities: learning a shared latent space that supports cohesive, realistic joint generation across modalities that are never naturally observed together.
The model is designed to be trained concurrently on data from diverse modalities.
At the core of this model is a variational autoencoder that generates realistic human single-cell transcriptomes:
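Below is a minimal PyTorch sketch of what such a core VAE could look like; the gene count, layer widths, latent dimension, and Gaussian reconstruction loss are illustrative assumptions, not the repository's actual implementation.

```python
# Illustrative sketch only: a Gaussian VAE over single-cell expression vectors.
import torch
import torch.nn as nn

class CoreVAE(nn.Module):
    def __init__(self, n_genes=2000, hidden_dim=512, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_genes, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)       # posterior mean
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)   # posterior log-variance
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_genes),
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps, with eps ~ N(0, I)
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term plus KL divergence to the standard-normal prior.
    recon = nn.functional.mse_loss(x_hat, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kld
```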
For zero-shot cross-generation across multiple modalities (e.g. species), modality-specific encoders and decoders are stacked at either end of the core VAE:
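A sketch of how channel-specific adapters could be stacked around the shared core, assuming the `CoreVAE` above; the channel names, adapter architecture, and dimensions are hypothetical.

```python
# Illustrative sketch only: channel-specific adapters around the shared CoreVAE.
import torch.nn as nn

class MultiChannelVAE(nn.Module):
    def __init__(self, core, channel_dims, shared_dim=2000):
        super().__init__()
        self.core = core  # e.g. the CoreVAE sketched above, with n_genes == shared_dim
        self.channel_encoders = nn.ModuleDict(
            {name: nn.Linear(dim, shared_dim) for name, dim in channel_dims.items()}
        )
        self.channel_decoders = nn.ModuleDict(
            {name: nn.Linear(shared_dim, dim) for name, dim in channel_dims.items()}
        )

    def forward(self, x, source, target):
        # Encode with the source channel's adapter and decode with the target's,
        # so cross-generation (e.g. human -> mouse) passes through the shared latent space.
        shared_in = self.channel_encoders[source](x)
        shared_out, mu, logvar = self.core(shared_in)
        return self.channel_decoders[target](shared_out), mu, logvar

# Hypothetical usage:
# model = MultiChannelVAE(CoreVAE(), {"human": 2000, "mouse": 1800})
# x_mouse_hat, mu, logvar = model(x_human, source="human", target="mouse")
```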
To encourage the channels to communicate and reinforce one another rather than partition the latent space, adversarial feedback will be added to the encoder, ensuring the core VAE encoder is not biased toward any single channel:
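One way such adversarial feedback could be wired is sketched below: a small classifier tries to predict which channel a latent code came from, and the encoder is penalized when it succeeds. The architecture and loss formulation are assumptions, not the project's actual design.

```python
# Illustrative sketch only: a channel discriminator on the shared latent space.
import torch.nn as nn

class ChannelDiscriminator(nn.Module):
    def __init__(self, latent_dim=64, n_channels=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, n_channels),  # logits over channels (e.g. species)
        )

    def forward(self, z):
        return self.net(z)

def discriminator_step_loss(disc, z, channel_labels):
    # The discriminator is trained to recognize which channel produced each latent code.
    return nn.functional.cross_entropy(disc(z.detach()), channel_labels)

def encoder_adversarial_loss(disc, z, channel_labels):
    # The encoder is penalized when the discriminator succeeds, pushing it toward
    # channel-agnostic latent codes (the two losses alternate during training).
    return -nn.functional.cross_entropy(disc(z), channel_labels)
```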
Outputs from each channel will also be evaluated by a GAN-style adversarial discriminator network to encourage realistic output distributions in the context of the real data:
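A sketch of such an output discriminator follows; the non-saturating BCE losses, feature count, and layer sizes are illustrative assumptions.

```python
# Illustrative sketch only: a GAN-style discriminator on decoded outputs.
import torch
import torch.nn as nn

class OutputDiscriminator(nn.Module):
    def __init__(self, n_features=2000, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden_dim), nn.LeakyReLU(0.2),
            nn.Linear(hidden_dim, 1),  # real-vs-generated logit
        )

    def forward(self, x):
        return self.net(x)

def discriminator_loss(disc, real, generated):
    bce = nn.functional.binary_cross_entropy_with_logits
    real_logits = disc(real)
    fake_logits = disc(generated.detach())  # do not update the generator here
    return bce(real_logits, torch.ones_like(real_logits)) + \
           bce(fake_logits, torch.zeros_like(fake_logits))

def generator_loss(disc, generated):
    # The decoder/generator is rewarded when its outputs are scored as real.
    bce = nn.functional.binary_cross_entropy_with_logits
    fake_logits = disc(generated)
    return bce(fake_logits, torch.ones_like(fake_logits))
```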
MMVAE is inspired by work on multimodal MMVAE from Shi et al. (2019) and adversarial integration from Kopp et al. (2022), while recognizing the challenges highlighted in this review.