This repository provides a Python implementation of several transcriptomic signatures that were associated with immunotherapy response in the literature, for different cancer types and checkpoint inhibitors.
It contains the code used in our study to benchmark transcriptomic signatures for predicting immunotherapy outcome in non-small cell lung cancer:
"Integration of clinical, pathological, radiological, and transcriptomic data improves the prediction of first-line immunotherapy outcome in metastatic non-small cell lung cancer"
Preprint: https://doi.org/10.1101/2024.06.27.24309583
Note: The transcriptomic signatures were selected based on the work of Kang et al. 2023.
- gseapy (= 1.1.3)
- pandas (= 1.5.3)
- pyyaml (>= 6.0)
- scikit-learn (>= 1.2.0)
Optional (to run the scripts):
- scikit-survival (>= 0.21.0)
- tqdm (>= 4.63.0)
- xgboost (>= 1.7.5)
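To compare an existing environment against the versions listed above, the installed versions can be printed with the standard library. This is a minimal sketch: the helper name `installed_version` is hypothetical and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str) -> str:
    """Hypothetical helper: return the installed version, or a placeholder if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"


# Package names as listed in the requirements above.
packages = [
    "gseapy", "pandas", "pyyaml", "scikit-learn",
    "scikit-survival", "tqdm", "xgboost",
]
for package in packages:
    print(f"{package}: {installed_version(package)}")
```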
Clone the repository:
git clone https://github.com/sysbio-curie/tipit_benchmark_RNA
Define and compute a transcriptomic signature
import pandas as pd
from benchmark_RNA.signatures import get_CYT_score
data = pd.read_csv("data/transcriptomic_data.csv", index_col=0)
# 1. Define the score function
CYT_fun = get_CYT_score(data)
# 2. Compute the scores (one value per sample, i.e. per row)
CYT_scores = data.agg(CYT_fun, axis=1)
Note: data should be a pandas DataFrame with samples in rows and genes in columns. Column names should be gene symbols.
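Expression matrices are often stored with genes in rows and samples in columns. The sketch below shows one way to bring such a table into the layout described in the note. The helper `to_samples_by_genes` and the toy values are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Toy expression matrix with genes in rows and samples in columns
# (values are made up for illustration).
genes_by_samples = pd.DataFrame(
    {"sample_1": [5.2, 3.1, 0.4], "sample_2": [1.8, 2.7, 6.0]},
    index=["GZMA", "PRF1", "CD274"],
)


def to_samples_by_genes(expr: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a samples x genes table with unique gene symbols."""
    data = expr.T
    # Strip stray whitespace from the gene symbols.
    data.columns = data.columns.astype(str).str.strip()
    if data.columns.duplicated().any():
        raise ValueError("Duplicated gene symbols found in columns.")
    return data


data = to_samples_by_genes(genes_by_samples)
print(data.shape)  # (2, 3): 2 samples in rows, 3 genes in columns
```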
Some signatures include pre-processing steps that are fitted on training data, such as PCA (e.g., FTBRS, TME), scaling (e.g., TIS, IIS), or KNN (MFP). For these signatures, it may be necessary to define the score function on one dataset and compute the scores on another.
Define and compute a transcriptomic signature with train and test data
import pandas as pd
from benchmark_RNA.signatures_gsea import get_MFP_score
data_train = pd.read_csv("data/transcriptomic_data_train.csv", index_col=0)
data_test = pd.read_csv("data/transcriptomic_data_test.csv", index_col=0)
# 1. Define the score function on the training data
MFP_fun = get_MFP_score(data_train)
# 2. Compute the scores on the test data
MFP_scores = data_test.agg(MFP_fun, axis=1)
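The pattern above (a function that takes training data and returns a per-sample score function) can be illustrated with a toy signature. The sketch below is not one of the signatures implemented in this repository: `get_scaled_mean_score`, the gene names, and the values are all hypothetical. It only shows how scaling parameters learned on the training set are reused when scoring test samples.

```python
import pandas as pd


def get_scaled_mean_score(data_train: pd.DataFrame, genes: list):
    """Hypothetical signature: mean of z-scored genes, with scaling learned on data_train."""
    mean = data_train[genes].mean()
    std = data_train[genes].std(ddof=0)

    def score(sample: pd.Series) -> float:
        # Scale one sample (one row) with the training statistics, then average.
        return float(((sample[genes] - mean) / std).mean())

    return score


# Toy data (made up): samples in rows, genes in columns.
data_train = pd.DataFrame(
    {"GENE_A": [1.0, 3.0], "GENE_B": [2.0, 6.0]}, index=["s1", "s2"]
)
data_test = pd.DataFrame({"GENE_A": [2.0], "GENE_B": [8.0]}, index=["s3"])

# 1. Define the score function on the training data
score_fun = get_scaled_mean_score(data_train, ["GENE_A", "GENE_B"])
# 2. Compute the scores on the test data
test_scores = data_test.agg(score_fun, axis=1)
```

Because the mean and standard deviation come from the training set only, the test samples never influence the parameters used to score them.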
We provide a Python script to reproduce the benchmark of transcriptomic signatures for predicting immunotherapy outcome in lung cancer reported in our paper. It defines and tests the different signatures across the folds of a repeated cross-validation scheme.
python extract_signatures.py -c config.yaml
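As a rough illustration of what defining and testing a signature across the folds of a repeated cross-validation can look like, here is a minimal sketch. It does not reproduce the script: the splitter (`RepeatedKFold`), the number of splits and repeats, the random data, and the stand-in `get_toy_score` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import RepeatedKFold


def get_toy_score(data_train: pd.DataFrame):
    """Hypothetical stand-in for a signature factory such as get_MFP_score."""
    mean = data_train.mean()
    std = data_train.std(ddof=0).replace(0, 1.0)  # avoid division by zero
    return lambda sample: float(((sample - mean) / std).mean())


# Random toy data: 20 samples in rows, 4 genes in columns.
rng = np.random.default_rng(0)
data = pd.DataFrame(
    rng.normal(size=(20, 4)),
    index=[f"sample_{i}" for i in range(20)],
    columns=["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
)

cv = RepeatedKFold(n_splits=5, n_repeats=2, random_state=0)
fold_scores = []
for fold, (train_idx, test_idx) in enumerate(cv.split(data)):
    # Define the signature on the training fold only...
    score_fun = get_toy_score(data.iloc[train_idx])
    # ...then compute its values on the held-out fold.
    scores = data.iloc[test_idx].agg(score_fun, axis=1)
    fold_scores.append(scores.rename(f"fold_{fold}"))

# One column per fold; each sample is scored once per repeat.
all_scores = pd.concat(fold_scores, axis=1)
```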
Name | Signature type | Cancer type | Immune checkpoint | References |
---|---|---|---|---|
CRMA | Marker genes | Melanoma | CTLA-4 | Shukla et al. |
CTLA4 | Marker genes | Multiple | PD-L1 | Herbst et al. |
CX3CL1 | Marker genes | Multiple | PD-L1 | Herbst et al. |
CXCL9 | Marker genes | Melanoma | PD-L1 | Qu et al. |
CYT | Marker genes | Multiple | PD-1, CTLA-4 | Rooney et al. |
EIGS | Marker genes | Multiple | PD-1 | Ayers et al. |
ESCS | Marker genes | Urothelial cancer | PD-1 | Wang et al. |
FTBRS | Marker genes | Multiple | PD-L1 | Mariathasan et al. |
HLADRA | Marker genes | Melanoma | PD-1, PD-L1 | Johnson et al. |
HRH1 | Marker genes | Multiple | PD-1, PD-L1, CTLA-4 | Li et al. |
IFNgamma | Marker genes | Multiple | PD-1 | Ayers et al. |
Immunopheno | Marker genes | Multiple | PD-1, CTLA-4 | Charoentong et al. |
IMPRES | Marker genes | Melanoma | PD-1, CTLA-4 | Auslander et al. |
IRG | Marker genes | Cervical cancer | PD-1, PD-L1, CTLA-4 | Yang et al. |
MPS | Marker genes | Melanoma | PD-1, CTLA-4 | Pérez-Guijarro et al. |
PD1 | Marker genes | Multiple | PD-1 | Taube et al. |
PDL1 | Marker genes | Multiple | PD-1, PD-L1 | Herbst et al. |
PDL2 | Marker genes | Multiple | PD-1 | Yearley et al. |
Renal101 | Marker genes | Renal cell carcinoma | PD-1, PD-L1 | Motzer et al. |
TIG | Marker genes | Multiple | PD-1 | Cristescu et al. |
TLS | Marker genes | Melanoma | PD-1, CTLA-4 | Cabrita et al. |
TME | Marker genes | Gastric cancer | PD-1, PD-L1, CTLA-4 | Zeng et al. |
APM | GSEA | Renal cell carcinoma | PD-1 | Senbabaoglu et al. |
CECMdown | GSEA | Multiple | PD-1 | Chakravarthy et al. |
CECMup | GSEA | Multiple | PD-1 | Chakravarthy et al. |
IIS | GSEA | Renal cell carcinoma | PD-1 | Senbabaoglu et al. |
IMS | GSEA | Gastric cancer | PD-1, PD-L1 | Lin et al. |
IPRES | GSEA | Multiple | PD-1 | Hugo et al. |
MFP | GSEA | Multiple | PD-1, PD-L1, CTLA-4 | Bagaev et al. |
MIAS | GSEA | Melanoma | PD-1 | Wu et al. |
PASSPRE | GSEA | Melanoma | PD-1 | Du et al. |
TIS | GSEA | Renal cell carcinoma | PD-1 | Senbabaoglu et al. |
CD8T_CIBERSORT | Deconvolution | Multiple | PD-1 | Tumeh et al. |
CD8T_MCPcounter | Deconvolution | Multiple | PD-1 | Tumeh et al. |
CD8T_Xcell | Deconvolution | Multiple | PD-1 | Tumeh et al. |
Immuno_CIBERSORT | Deconvolution | Melanoma | PD-1 | Nie et al. |
This repository was created as part of the PhD project of Nicolas Captier in the Computational Systems Biology of Cancer group and the Laboratory of Translational Imaging in Oncology (LITO) of Institut Curie.