Predict Immune and Inflammatory Gene Signature Expression Directly from Histology Images

This repository predicts six gene signatures associated with response to nivolumab and survival in advanced hepatocellular carcinoma (HCC), as reported by Sangro, Bruno, et al.:

6-Gene Interferon Gamma (Ayers, Mark, et al.)
Gajewski 13-Gene Inflammatory (Spranger, Stefani, Riyue Bao, and Thomas F. Gajewski)
Inflammatory (Sangro, Bruno, et al.)
Interferon Gamma Biology (Ayers, Mark, et al.)
Ribas 10-Gene Interferon Gamma (Ayers, Mark, et al.)
T-cell Exhaustion (Ayers, Mark, et al.)

Hierarchical clustering was performed on the gene expression data to generate labels for Whole Slide Images (WSIs). The deep learning models were trained on 60% of the TCGA-LIHC dataset with 10-fold Monte Carlo cross-validation (20% held out for validation) and tested on the remaining 20%. Our in-house dataset (from APHP Henri Mondor) was then used for external validation. Results obtained with tumoral annotations (regions of interest drawn by our expert pathologist) were better than those obtained with all tissue regions.
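The 60/20/20 scheme with 10 Monte Carlo folds can be sketched as follows. This is only an illustration: the function name and the case IDs are hypothetical, and it assumes that every fold draws a fresh random partition of all cases, whereas the repository may instead keep a single fixed test set.

```python
import random


def monte_carlo_splits(case_ids, n_folds=10, val_frac=0.2, test_frac=0.2, seed=0):
    """Return a list of (train, val, test) ID lists, one per fold.

    Assumption: each fold is an independent random 60/20/20 partition.
    """
    rng = random.Random(seed)
    ids = sorted(case_ids)  # sort first so the result depends only on the seed
    n_val = round(len(ids) * val_frac)
    n_test = round(len(ids) * test_frac)
    folds = []
    for _ in range(n_folds):
        shuffled = ids[:]
        rng.shuffle(shuffled)
        val = shuffled[:n_val]
        test = shuffled[n_val:n_val + n_test]
        train = shuffled[n_val + n_test:]  # whatever is left (about 60%)
        folds.append((train, val, test))
    return folds


# Hypothetical case identifiers, for illustration only.
folds = monte_carlo_splits([f"case_{i:03d}" for i in range(100)])
```

Unlike classic k-fold cross-validation, the validation sets of different Monte Carlo folds may overlap, because each fold is sampled independently.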

Of note, the discovery series was stained with hematein-eosin (H&E), while the external validation series was stained with hematein-eosin-saffron (HES). We therefore tested stain unmixing (three methods implemented: Macenko PCA, Xu SNMF, or a fixed HES vector) and saffron removal on the external validation series. Color normalization (two methods: Reinhard or Macenko PCA) was also tested on both the discovery and validation series. On-the-fly basic geometric augmentations were also tested during training.
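To make the fixed-vector variant concrete, here is a minimal sketch of stain unmixing followed by saffron removal, using Beer-Lambert optical density and a 3x3 stain matrix. It is not the repository's implementation. The hematoxylin and eosin rows are the commonly quoted Ruifrok and Johnston values; the saffron row is a hypothetical placeholder, and `remove_saffron` is an illustrative name.

```python
import numpy as np

# Rows: hematoxylin, eosin, saffron (RGB optical-density directions).
# The saffron row is a HYPOTHETICAL placeholder, not a calibrated vector.
STAINS = np.array([
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.10, 0.25, 0.96],
])
STAINS = STAINS / np.linalg.norm(STAINS, axis=1, keepdims=True)


def remove_saffron(rgb, stains=STAINS, saffron_index=2):
    """Unmix an (H, W, 3) RGB patch into stain concentrations, zero the
    saffron channel, and recompose the RGB image."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Beer-Lambert: optical density is linear in stain concentration.
    od = -np.log10(np.clip(rgb, 1.0, 255.0) / 255.0)
    conc = od.reshape(-1, 3) @ np.linalg.inv(stains)
    conc[:, saffron_index] = 0.0          # drop the saffron contribution
    od_clean = conc @ stains
    out = 255.0 * np.power(10.0, -od_clean)
    return np.clip(out, 0, 255).reshape(rgb.shape).round().astype(np.uint8)


patch = np.full((4, 4, 3), 200, dtype=np.uint8)  # dummy patch
cleaned = remove_saffron(patch)
```

Macenko PCA and Xu SNMF differ from this sketch in that they estimate the stain matrix from each image instead of fixing it in advance.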

Three deep learning approaches:

Patch-based (original repo)
Two Multiple Instance Learning (MIL) approaches: CLAM and classic MIL (original repo)
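These families differ mainly in how patch-level information becomes a slide-level prediction. The sketch below contrasts three generic aggregation rules; it assumes mean voting for the patch-based approach, max pooling for classic MIL, and plain (ungated) attention pooling as a stand-in for CLAM. The actual rules used in the original repositories may differ, and all function names and the scoring vector `w` are hypothetical.

```python
import numpy as np


def patch_vote(patch_probs):
    """Patch-based style: average per-patch probabilities into a slide score."""
    return float(np.mean(patch_probs))


def max_pool_mil(patch_probs):
    """Classic MIL style: the slide score is that of the highest-scoring patch."""
    return float(np.max(patch_probs))


def attention_pool(features, w):
    """Attention-MIL style: softmax weights over patches, then a weighted
    mean of the patch features. `w` stands in for learned parameters."""
    scores = features @ w
    scores = scores - scores.max()                 # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ features, weights


probs = np.array([0.2, 0.9, 0.1])
features = np.array([[1.0, 0.0], [3.0, 2.0]])
embedding, weights = attention_pool(features, w=np.zeros(2))
```

With attention pooling, the per-patch weights can also be reused to draw attention maps over the slide.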

Results

AUROC in the discovery series (TCGA-LIHC) with/without tumoral annotations:

| Gene signature | Tumor annot. | Patch-based (best fold) | Patch-based (mean ± sd) | Classic MIL (best fold) | Classic MIL (mean ± sd) | CLAM (best fold) | CLAM (mean ± sd) |
|---|---|---|---|---|---|---|---|
| 6G Interferon Gamma | ❌ | 0.578 | 0.492 ± 0.065 | 0.690 | 0.576 ± 0.102 | 0.734 | 0.600 ± 0.080 |
| 6G Interferon Gamma | ✔️ | 0.661 | 0.560 ± 0.067 | 0.758 | 0.630 ± 0.078 | 0.780 | 0.635 ± 0.097 |
| Gajewski 13G Inflammatory | ❌ | 0.780 | 0.666 ± 0.072 | 0.851 | 0.577 ± 0.179 | 0.824 | 0.632 ± 0.107 |
| Gajewski 13G Inflammatory | ✔️ | 0.809 | 0.688 ± 0.062 | 0.893 | 0.694 ± 0.125 | 0.914 | 0.728 ± 0.096 |
| Inflammatory | ❌ | 0.673 | 0.523 ± 0.079 | 0.717 | 0.539 ± 0.139 | 0.738 | 0.607 ± 0.090 |
| Inflammatory | ✔️ | 0.706 | 0.580 ± 0.077 | 0.806 | 0.641 ± 0.123 | 0.796 | 0.665 ± 0.081 |
| Interferon Gamma biology | ❌ | 0.700 | 0.541 ± 0.088 | 0.672 | 0.562 ± 0.117 | 0.759 | 0.622 ± 0.088 |
| Interferon Gamma biology | ✔️ | 0.783 | 0.561 ± 0.119 | 0.677 | 0.610 ± 0.051 | 0.822 | 0.674 ± 0.102 |
| Ribas 10G Inflammatory | ❌ | 0.672 | 0.583 ± 0.081 | 0.652 | 0.552 ± 0.083 | 0.758 | 0.627 ± 0.082 |
| Ribas 10G Inflammatory | ✔️ | 0.727 | 0.640 ± 0.074 | 0.726 | 0.618 ± 0.065 | 0.806 | 0.669 ± 0.067 |
| T cell exhaustion | ❌ | 0.661 | 0.490 ± 0.108 | 0.744 | 0.516 ± 0.123 | 0.627 | 0.555 ± 0.063 |
| T cell exhaustion | ✔️ | 0.661 | 0.543 ± 0.073 | 0.788 | 0.606 ± 0.086 | 0.788 | 0.577 ± 0.092 |
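For readers unfamiliar with the metric, the sketch below shows one way the per-fold AUROC and the "best fold" and "mean ± sd" columns could be computed. It assumes that "best fold" means the highest AUROC among the folds and that sd is the sample standard deviation; neither is confirmed here, and the function names are illustrative.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the probability that a random positive slide scores higher
    than a random negative slide (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_aurocs):
    """Collapse per-fold AUROCs into the three numbers reported per model."""
    return {
        "best_fold": max(fold_aurocs),
        "mean": mean(fold_aurocs),
        "sd": stdev(fold_aurocs),  # sample sd (assumption)
    }
```

The pairwise formulation is quadratic in the number of slides, which is acceptable for a few hundred WSIs; a rank-based implementation would scale better.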

AUROC (of the best-fold model) in the external validation series (Mondor) with tumoral annotations:

| Gene signature (with tumor annot ✔️) | Patch-based | Classic MIL | CLAM |
|---|---|---|---|
| 6G Interferon Gamma | 0.694 | 0.745 | 0.871 |
| Gajewski 13G Inflammatory | 0.657 | 0.782 | 0.810 |
| Inflammatory | 0.657 | 0.816 | 0.850 |
| Interferon Gamma biology | 0.755 | 0.793 | 0.823 |
| Ribas 10G Inflammatory | 0.605 | 0.779 | 0.810 |
| T cell exhaustion | 0.810 | 0.868 | 0.921 |

Visualization / explainability:

Workflow

Part 1. Gene expression clustering

To generate labels for WSIs

Process TCGA FPKM data with gene_clust/codes/tcga_fpkm_processing.ipynb
Perform hierarchical clustering with gene_clust/codes/PlotHeatmapGeneSignature.R (to reproduce the heatmap), or in Python with gene_clust/codes/tcga_fpkm_clustering.ipynb (to get the same clustering results).
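The labeling idea behind these two steps can be sketched with SciPy as below. This is not the content of the notebook or the R script: the Ward linkage, the per-gene z-scoring, and the cut into exactly two clusters (high vs. low expression) are assumptions made for illustration, and `cluster_labels` is a hypothetical name.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_labels(expr):
    """expr: (n_samples, n_genes) array of log-transformed expression values
    for one gene signature. Returns 0/1 labels, where 1 marks the cluster
    with the higher mean expression."""
    expr = np.asarray(expr, dtype=float)
    std = expr.std(axis=0)
    # z-score each gene so that no single gene dominates the distance
    z = (expr - expr.mean(axis=0)) / np.where(std == 0, 1.0, std)
    tree = linkage(z, method="ward")                      # assumption: Ward
    clusters = fcluster(tree, t=2, criterion="maxclust")  # cluster ids 1 and 2
    means = [z[clusters == c].mean() for c in (1, 2)]
    high = 1 + int(np.argmax(means))
    return (clusters == high).astype(int)


# Synthetic example: 5 low-expression and 5 high-expression samples, 6 genes.
rng = np.random.default_rng(0)
expr = np.vstack([rng.normal(0, 0.1, (5, 6)), rng.normal(5, 0.1, (5, 6))])
labels = cluster_labels(expr)
```

Each WSI would then inherit the label of the sample it belongs to, once per gene signature.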

All TCGA data used and the clustering results are provided in gene_clust/data and gene_clust/results. Due to privacy constraints, the data from the Mondor series are not provided, but the commands for external validation are described in this tutorial.

Part 2. Deep learning

To classify WSIs

The patch-based approach requires a different conda environment from the two MIL approaches. Following the original CLAM repository, there are two options for tessellation: saving both coordinates and images, or saving only coordinates to economize storage space (especially for large datasets or multiple modified patch versions) and loading images on-the-fly during feature extraction (so-called fp). Annotations should be given as coordinates at the highest magnification of the WSI. Both simple annotations in TXT and hierarchical annotations in NPT (for example, to exclude necrosis inside a tumor) are accepted.
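The coordinate-only ("fp") idea combined with hierarchical annotations can be illustrated with the sketch below. It stores only the top-left corner of each patch and keeps a patch when its centre lies inside the tumor polygon but outside any excluded region. The function names, the centre-based rule, and the polygon format are assumptions for illustration, not the repository's file formats or logic.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def patch_coords(width, height, patch_size, include, exclude=()):
    """Top-left corners (level-0 pixels) of non-overlapping patches whose
    centre is inside `include` and outside every polygon in `exclude`."""
    coords = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            cx, cy = left + patch_size / 2, top + patch_size / 2
            if not point_in_polygon(cx, cy, include):
                continue
            if any(point_in_polygon(cx, cy, hole) for hole in exclude):
                continue
            coords.append((left, top))
    return coords


# Toy example: a square tumor region with a necrotic area in its middle.
tumor = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]
necrosis = [(400, 400), (600, 400), (600, 600), (400, 600)]
coords = patch_coords(1000, 1000, 200, tumor, exclude=[necrosis])
```

Only `coords` needs to be saved; the pixel data for each patch can be read from the WSI at feature-extraction time, which is what keeps storage small.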

Patch-based approach
- fp
  - Without annotations: tutorial_patch-based_fp
  - With annotations: tutorial_patch-based_fp_anno
- not fp
  - Without annotations: tutorial_patch-based
  - With annotations: tutorial_patch-based_anno
Classic MIL
- fp
  - Without annotations: tutorial_mil_fp
  - With annotations: tutorial_mil_fp_anno
- not fp
  - Without annotations: tutorial_mil
  - With annotations: tutorial_mil_anno
CLAM
- fp
  - Without annotations: tutorial_clam_fp
  - With annotations: tutorial_clam_fp_anno
- not fp
  - Without annotations: tutorial_clam
  - With annotations: tutorial_clam_anno
Other settings: tutorial, including stain unmixing (and saffron removal), color normalization or data augmentation.
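As an illustration of the data augmentation setting, the sketch below applies basic geometric augmentation on the fly to a square patch. The specific operations (90-degree rotations and flips) are an assumption about what "basic geometric augmentation" covers, and the function is not taken from data_augm.py.

```python
import numpy as np


def random_geometric_augment(patch, rng):
    """Apply a random 90-degree rotation and random flips to a square
    (H, W, C) patch. Pixel values are only rearranged, never changed."""
    patch = np.rot90(patch, k=int(rng.integers(0, 4)), axes=(0, 1))
    if rng.random() < 0.5:
        patch = np.flip(patch, axis=0)  # vertical flip
    if rng.random() < 0.5:
        patch = np.flip(patch, axis=1)  # horizontal flip
    return np.ascontiguousarray(patch)


rng = np.random.default_rng(0)
patch = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)  # dummy patch
augmented = random_geometric_augment(patch, rng)
```

Because histology has no canonical orientation, these transformations preserve the label while exposing the model to more orientations of the same tissue.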
