News of MIL-Baseline
**Update Preview**
1. Comprehensive update of the feature extraction and heatmap visualization system (integrated with Trident)
2. Comprehensive update for survival analysis adaptation

- 2025-11-08: Adapt the H5-format feature files to Trident (https://github.com/mahmoodlab/TRIDENT) and add the balanced_sampler plugin.
- 2025-01-10: Fix bugs of MIL_BASELINE, update the visualization tools, add new MIL methods, add new dataset-split methods.
- 2024-11-24: Update MIL fine-tuning (gate_ab_mil, ab_mil) for rrt_mil.
- 2024-10-12: Fix bug of the CTransPath feature encoder.
- 2024-10-02: Add the FR_MIL implementation.
- 2024-08-20: Fix bug of early stopping.
- 2024-07-27: Fix bug of the PLIP transforms.
- 2024-07-21: Fix bugs of DTFD-MIL and test_mil.py.
- 2024-07-20: Fix bugs of all MIL models except DTFD-MIL.
This project was originally developed for our previous work and is continuously maintained to be more user-friendly and support more approaches for histopathology WSI analysis.
If you find this codebase helpful in your research, please consider citing:
```bibtex
@inproceedings{ling2024agent,
  title     = {Agent Aggregator with Mask Denoise Mechanism for Histopathology Whole Slide Image Analysis},
  author    = {Ling, Xitong and Ouyang, Minxi and Wang, Yizhi and Chen, Xinrui and Yan, Renao and Chu, Hongbo and Cheng, Junru and Guan, Tian and Tian, Sufang and Liu, Xiaoping and others},
  booktitle = {Proceedings of the 32nd ACM International Conference on Multimedia},
  pages     = {2795--2803},
  year      = {2024}
}
```

- A library that integrates different MIL methods into a unified framework
- A library that integrates different datasets into a unified interface
- A library that provides the dataset-split methods that are commonly used
- A library that is easily extended by following a uniform definition
- Users only need to provide a CSV in the following format, whether the dataset is public or private:
/datasets/example_Dataset.csv
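As a minimal illustration of building such a slide table with pandas (the column names below are assumptions for illustration only; the authoritative schema is /datasets/example_Dataset.csv in the repository):

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical columns -- follow /datasets/example_Dataset.csv for the real schema.
df = pd.DataFrame({
    "slide_path": ["/data/WSIs/slide_001.svs", "/data/WSIs/slide_002.svs"],
    "slide_label": [0, 1],
})

# One row per slide: its path plus its label.
out = Path(tempfile.mkdtemp()) / "my_Dataset.csv"
df.to_csv(out, index=False)
```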
- User-defined Train-Val-Test split
- User-defined Train-Val split
- User-defined Train-Test split
- Train-Val split with K-fold
- Train-Val-Test split with K-fold
- Train-Val with K-fold, then test
- The differences between the splits are described in
/split_scripts/README.md
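For intuition, a label-stratified K-fold split like the ones the split scripts generate can be sketched with scikit-learn; the column names here are illustrative assumptions, not the library's actual schema:

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Toy slide table; "slide_path"/"slide_label" are hypothetical names.
df = pd.DataFrame({
    "slide_path": [f"/data/slide_{i:03d}.pt" for i in range(10)],
    "slide_label": [0, 1] * 5,
})

# Stratification keeps the class ratio of the full table in every fold.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=2024)
folds = []
for train_idx, val_idx in skf.split(df, df["slide_label"]):
    folds.append((df.iloc[train_idx], df.iloc[val_idx]))
```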
- R50 Deep Residual Learning for Image Recognition (CVPR 2016)
- VIT-S An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ICLR 2021)
- CTRANSPATH Transformer-Based Self-supervised Learning for Histopathological Image Classification (MIA 2023)
- PLIP A visual-language foundation model for pathology image analysis using medical Twitter (NAT MED 2023)
- CONCH A visual-language foundation model for computational pathology (NAT MED 2024)
- UNI Towards a general-purpose foundation model for computational pathology (NAT MED 2024)
- UNI-V2 Towards a general-purpose foundation model for computational pathology (NAT MED 2024)
- GIGAPATH A whole-slide foundation model for digital pathology from real-world data (NAT 2024)
- VIRCHOW A foundation model for clinical-grade computational pathology and rare cancers (NAT 2024)
- VIRCHOW-V2 Virchow2: Scaling Self-Supervised Mixed Magnification Models in Pathology (ARXIV 2024)
- CONCH-V1.5 Multimodal Whole Slide Foundation Model for Pathology (NAT MED 2025)
- HOPTIMUS-V0
- HOPTIMUS-V1
- MIDNIGHT
- UPDATING...
- MEAN_MIL
- MAX_MIL
- AB_MIL Attention-based Deep Multiple Instance Learning (ICML 2018)
- MIXUP_MIL mixup: Beyond Empirical Risk Minimization (ICLR 2018)
- DT_MIL Deformable Transformer for Multi-instance Learning on Histopathological Image (MICCAI 2021)
- TRANS_MIL Transformer based Correlated Multiple Instance Learning for WSI Classification (NeurIPS 2021)
- DS_MIL Dual-stream MIL Network for WSI Classification with Self-supervised Contrastive Learning (CVPR 2021)
- CLAM_MIL Data Efficient and Weakly Supervised Computational Pathology on WSI (NAT BIOMED ENG 2021)
- PGCN_MIL Context-Aware Survival Prediction using Patch-based Graph Convolutional Networks (MICCAI 2021)
- REMIX_MIL A General and Efficient Framework for MIL based WSI Classification (MICCAI 2022)
- S4_MIL Efficiently Modeling Long Sequences with Structured State Spaces (ICLR 2022)
- DG_MIL Distribution Guided Multiple Instance Learning for Whole Slide Image Classification (MICCAI 2022)
- DTFD_MIL Double-Tier Feature Distillation MIL for Histopathology WSI Classification (CVPR 2022)
- ADD_MIL Additive MIL: Intrinsically Interpretable MIL for Pathology (NeurIPS 2022)
- ILRA_MIL Exploring Low-rank Property in MIL for Whole Slide Image classification (ICLR 2023)
- IIB_MIL Integrated instance-level and bag-level MIL with label disambiguation (MICCAI 2023)
- IB_MIL Interventional Bag Multi-Instance Learning On Whole-Slide Pathological Images (CVPR 2023)
- RANKMIX_MIL Data Augmentation for Classifying WSIs with Diverse Sizes (CVPR 2023)
- MHIM_MIL MIL Framework with Masked Hard Instance Mining for WSI Classification (ICCV 2023)
- WIKG_MIL Dynamic Graph Representation with Knowledge-aware Attention for WSI Analysis (CVPR 2024)
- AMD_MIL Agent Aggregator with Mask Denoise Mechanism for Histopathology WSI Analysis (MM 2024)
- FR_MIL Distribution Re-calibration based MIL with Transformer for WSI Classification (TMI 2024)
- PSEBMIX_MIL Pseudo-Bag Mixup Augmentation for MIL Based Whole Slide Image Classification (TMI 2024)
- LONG_MIL Scaling Long Contextual MIL for Histopathology WSI Analysis (NeurIPS 2024)
- DGR_MIL Exploring Diverse Global Representation in MIL for WSI Classification (ECCV 2024)
- CDP_MIL cDP-MIL: Robust Multiple Instance Learning via Cascaded Dirichlet Process (ECCV 2024)
- CA_MIL Context-Aware Multiple Instance Learning for WSI Classification (ICLR 2024)
- AC_MIL Attention-Challenging Multiple Instance Learning for WSI Classification (ECCV 2024)
- MAMBA_MIL Enhancing Long Sequence Modeling with Sequence Reordering in CPath (MICCAI 2024)
- SC_MIL Sparse Context-aware MIL for Predicting Cancer Survival Probability Distribution in WSI (MICCAI 2024)
- NCIE_MIL Rethinking Decoupled MIL Framework for Histopathological Slide Classification (MIDL 2024)
- RRT_MIL Towards Foundation Model-Level Performance in Computational Pathology (CVPR 2024)
- PA_MIL Dynamic Policy-Driven Adaptive Multi-Instance Learning for WSI Classification (CVPR 2024)
- MICRO_MIL Graph-Based MIL for Context-Aware Diagnosis with Microscopic Images (MICCAI 2025)
- DYHG_MIL Dynamic Hypergraph Representation for Bone Metastasis Cancer Analysis (CMPB 2025)
- MSM_MIL Multi-scan Mamba-based Multiple Instance Learning for WSI classification (KBS 2025)
- MAMBA2D_MIL 2DMamba: Efficient State Space Model for Image Representation (CVPR 2025)
- AEM_MIL Attention Entropy Maximization for MIL based WSI Classification (MICCAI 2025)
- MICO_MIL Multiple Instance Learning with Context-Aware Clustering (MICCAI 2025)
- TDA_MIL Top-Down Attention-based Multiple Instance Learning for Whole Slide Image Analysis (MICCAI 2025)
- GDF_MIL Rethinking Multi-Instance Learning through Graph-Driven Fusion (AAAI 2026)
- UPDATING...
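To illustrate the shared core of these aggregators, here is the attention pooling of AB_MIL (Ilse et al., ICML 2018) sketched in NumPy with random parameters; in the library, V and w are learned and the models are implemented in /modules:

```python
import numpy as np

def attention_mil_pool(H, V, w):
    """AB_MIL-style pooling: a_k = softmax_k(w @ tanh(V @ h_k)), z = sum_k a_k * h_k.
    H: (N, D) instance features; V: (L, D) and w: (L,) are learnable in practice."""
    scores = w @ np.tanh(V @ H.T)    # (N,) one attention logit per instance
    a = np.exp(scores - scores.max())
    a = a / a.sum()                  # softmax over the bag
    z = a @ H                        # (D,) attention-weighted bag embedding
    return z, a

rng = np.random.default_rng(0)
H = rng.standard_normal((8, 16))     # a bag of 8 patch features
z, a = attention_mil_pool(H, rng.standard_normal((32, 16)), rng.standard_normal(32))
```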
- AUC: macro, micro, weighted (identical for 2 classes)
- F1, PRE, RECALL: macro, micro, weighted
- ACC, BACC: BACC is the macro RECALL
- KAPPA: linear, quadratic
- Confusion_Mat
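Two of these conventions can be sanity-checked with scikit-learn on toy labels (chosen only for illustration): BACC equals macro recall, and KAPPA supports linear or quadratic weighting:

```python
from sklearn.metrics import (balanced_accuracy_score, cohen_kappa_score,
                             confusion_matrix, recall_score)

y_true = [0, 0, 1, 1, 2, 2, 2]   # toy ground truth, three classes
y_pred = [0, 1, 1, 1, 2, 0, 2]   # toy predictions

bacc = balanced_accuracy_score(y_true, y_pred)          # macro-averaged recall
macro_recall = recall_score(y_true, y_pred, average="macro")
kappa_lin = cohen_kappa_score(y_true, y_pred, weights="linear")
kappa_quad = cohen_kappa_score(y_true, y_pred, weights="quadratic")
cm = confusion_matrix(y_true, y_pred)                    # 3x3 for three classes
```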
MIL_BASELINE consists of the following parts:
- /configs: defines MIL models through YAML configuration files.
- /modules: the network architectures of the different MIL models.
- /process: the training frameworks for the different MIL models.
- /feature_extractor: supports different feature extractors.
- /split_scripts: supports different dataset-split methods.
- /vis_scripts: visualization scripts for t-SNE and attention.
- /datasets: user dataset path information.
- /utils: the framework's utility scripts.
- train_mil.py: the training entry point of the framework.
- test_mil.py: the testing entry point of the framework.
Supported formats include OpenSlide and SDPC formats. The following backbones are supported: R50, VIT-S, CTRANSPATH, PLIP, CONCH, UNI, GIGAPATH, VIRCHOW, VIRCHOW-V2 and CONCH-V1.5. Detailed usage instructions can be found in /feature_extractor/README.md.
Feature extraction is orthogonal to MIL training. Therefore, we also recommend using repositories such as PIANO or TRIDENT for your feature extraction work.
You should construct a CSV file following the format of /datasets/example_Dataset.csv.
You can use the dataset-split scripts to perform different dataset splits; the detailed split-method descriptions are in /split_scripts/README.md.
You can configure the YAML file in /configs, for example /configs/AB_MIL.yaml; a detailed explanation is written in /configs/AB_MIL.yaml.
Then /train_mil.py will help you like this:

```shell
python train_mil.py --yaml_path /configs/AB_MIL.yaml
```

We also support dynamic parameter passing; you can pass any parameter that exists in the /configs/AB_MIL.yaml file, for example:

```shell
python train_mil.py --yaml_path /configs/AB_MIL.yaml --options General.seed=2024 General.num_epochs=20 Model.in_dim=768
```

/test_mil.py will help you test pretrained MIL models like this:
```shell
python test_mil.py --yaml_path /configs/AB_MIL.yaml --test_dataset_csv /your/test_csv/path --model_weight_path /your/model_weights/path --test_log_dir /your/test/log/dir
```

You should ensure the --test_dataset_csv contains a test_slide_path column whose entries point to /path/to/your_pt.pt feature files. If it also contains a test_slide_label column, metrics will be calculated and written to the logs.
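The dotted --options overrides can be thought of as key paths into the nested YAML config. A minimal sketch of the idea (the real parser lives in the framework's /utils and may differ in details):

```python
def apply_options(config, options):
    """Merge 'Section.key=value' strings into a nested config dict (sketch only)."""
    for opt in options:
        dotted, _, raw = opt.partition("=")
        *parents, leaf = dotted.split(".")
        node = config
        for key in parents:
            node = node.setdefault(key, {})
        # Best-effort typing: try int, then float, else keep the raw string.
        for cast in (int, float):
            try:
                node[leaf] = cast(raw)
                break
            except ValueError:
                continue
        else:
            node[leaf] = raw
    return config

cfg = {"General": {"seed": 0, "num_epochs": 10}, "Model": {"in_dim": 1024}}
apply_options(cfg, ["General.seed=2024", "General.num_epochs=20", "Model.in_dim=768"])
```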
You can easily visualize the dimensionality-reduction map of the features from a trained MIL model, and the distribution of attention scores (or importance scores), with /vis_scripts/draw_feature_map.py and /vis_scripts/draw_attention_map.py. We have implemented standardized global-feature and attention-score output interfaces for most models, making these visualization scripts compatible with most MIL models in the library. The detailed usage instructions are in /vis_scripts/README.md.
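The feature-map visualization boils down to a 2-D embedding of the exported features. With random stand-in features (the real scripts read features through the library's standardized interfaces), the idea looks like:

```python
import numpy as np
from sklearn.manifold import TSNE

# Random stand-in for the per-instance features a trained MIL model exports.
rng = np.random.default_rng(0)
features = rng.standard_normal((50, 128))    # 50 instances, 128-dim each

# emb is (50, 2): one 2-D point per instance, ready for a scatter plot.
emb = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(features)
```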
You can use MIL_BASELINE as a package, but you should rename the folder MIL_BASELINE-main to MIL_BASELINE.
Thanks to the following repositories for inspiring this one:
- https://github.com/mahmoodlab/CLAM
- https://github.com/WonderLandxD/opensdpc
- https://github.com/RenaoYan/Build-Patch-for-Sdpc
- https://github.com/DearCaat/MHIM-MIL
Our personal experience is limited; code contributions are welcome.
