Computational Notebooks for "MORPHӔUS: Generative AI for Morphology-Aware Profiling of Human Cancer"
Gregory J. Baker1,2,3,*,#, Edward Novikov1,4,*, Yu-An Chen1,2, Clemens B. Hug1, Zergham Ahmed1,4, Sebastián A. Cajas Ordóñez4, Siyu Huang4, Clarence Yapp1,5, Shannon Coy1,2,6, Hanspeter Pfister4, Artem Sokolov1,7, Peter K. Sorger1,2,3,#
1Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 2Ludwig Center for Cancer Research at Harvard, Harvard Medical School, Boston, MA 3Department of Systems Biology, Harvard Medical School, Boston, MA 4Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 5Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA
*Co-first Authors: G.J.B., E.N.
#Corresponding Authors: gregory_baker2@hms.harvard.edu (G.J.B.), peter_sorger@hms.harvard.edu (P.K.S.)
Alterations in tissue organization and morphology are critical biomarkers of disease progression and therapeutic response. While immunofluorescence images provide information on protein abundance and spatial distribution within tissues, segmentation-based analysis methods fail to extract morphological detail, suffer from signal spillover across cell boundaries, and rely on custom algorithms to infer spatial relationships among segmented cells. Here we introduce MORPHӔUS, a spatial biology framework that classifies multiplex histology images at the pixel level across arbitrary length scales using generative modeling with variational autoencoders (VAEs). When applied to human colorectal cancer, MORPHӔUS identifies biologically meaningful cell states, morphologies, cell-cell interactions, and composite tissue structures with greater accuracy than segmentation-based approaches while avoiding the problem of signal spillover. This fully unsupervised method requires no ground-truth annotations and is agnostic to the number and nature of immunomarkers used, making it broadly applicable to a wide range of bioimaging applications.
If conda is not already installed, install it by following the conda installation instructions.
The Python code in this GitHub repository is organized as Jupyter Notebooks and was used to generate the figures shown in the paper. To run the code, first clone this repo onto your computer. Then download the required input data folder from the Sage Bionetworks Synapse data repository dedicated to the MORPHӔUS project into the src folder of the cloned repo. The input data folder also contains the full images and image patches used in the paper. Change directories into the top level of the cloned repo, then create and activate a dedicated Conda environment with the Python libraries needed to run the code by entering the following commands:
cd <path/to/cloned/repo>
conda env create -f environment.yml
conda activate morphaeus-paper
Next, change directories to the src folder and open the computational notebooks in JupyterLab with the following command:
jupyter lab
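Before launching JupyterLab, it can help to confirm that the cloned repo has the layout the steps above rely on. The following is a minimal sketch, not part of the repository: the `check_setup` helper and its `data_folder` parameter are hypothetical, and it assumes that `environment.yml` sits at the top level of the repo and that the notebooks sit directly inside `src`. The name of the downloaded input data folder is not stated here, so it is passed in as an optional argument rather than guessed.

```python
from pathlib import Path


def check_setup(repo_root, data_folder=None):
    """Return a list of human-readable problems found in the repo layout.

    Hypothetical helper: the layout it checks is an assumption based on
    the setup instructions, not a guarantee about the repository.
    """
    root = Path(repo_root)
    src = root / "src"
    problems = []

    # `conda env create -f environment.yml` is run from the top level.
    if not (root / "environment.yml").is_file():
        problems.append("environment.yml not found at the top level of the repo")

    # The notebooks are opened from inside the src folder.
    if not src.is_dir():
        problems.append("src folder not found")
    elif not any(src.glob("*.ipynb")):
        problems.append("no Jupyter notebooks (*.ipynb) found directly inside src")

    # Only checked when the caller supplies the input data folder name.
    if data_folder is not None and not (src / data_folder).is_dir():
        problems.append(f"input data folder '{data_folder}' not found inside src")

    return problems


if __name__ == "__main__":
    # Run from the top level of the cloned repo.
    for problem in check_setup("."):
        print("WARNING:", problem)
```

Run from the top level of the cloned repo, the script prints one warning per missing piece and prints nothing when the assumed layout is in place.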
MORPHӔUS source code will be made freely available upon the release of the paper and will be archived on GitHub and Zenodo.
Image files associated with this paper were first generated as part of the Human Tumor Atlas Network (HTAN) project and are available at the HTAN Data Portal. The input images required to run the source code found here are also freely available at Sage Synapse.
This work was supported by Ludwig Cancer Research and the Ludwig Center at Harvard (P.K.S., S.S.) and by NIH NCI grants U54-CA225088, U2C-CA233280, and U2C-CA233262 (P.K.S., S.S.). S.S. is supported by the BWH President’s Scholars Award.
The Python code (i.e., Jupyter Notebooks) in this GitHub repository is archived on Zenodo at