This fork was created as part of the academic project "Machine Learning in Medical Image Processing".
The main contributions are more complex augmentations and the addition of MobileNetV2.
The JSON config files were adapted for training on Google Colab, the platform on which we trained the models.
CheXpert is a large dataset of chest X-rays and a competition for automated chest X-ray interpretation. It features uncertainty labels and radiologist-labeled reference standard evaluation sets.
Chest radiography is the most common imaging examination globally, critical for screening, diagnosis, and management of many life-threatening diseases. Automated chest radiograph interpretation at the level of practicing radiologists could provide substantial benefit in many medical settings, from improved workflow prioritization and clinical decision support to large-scale screening and global population health initiatives. For progress in both development and validation of automated algorithms, we realized there was a need for a labeled dataset that (1) was large, (2) had strong reference standards, and (3) provided expert human performance metrics for comparison.
CheXpert uses a hidden test set for official evaluation of models. Teams submit their executable code on Codalab, which is then run on a test set that is not publicly readable. Such a setup preserves the integrity of the test results.
The competition provides a tutorial that walks you through official evaluation of your model. Once your model has been evaluated officially, your scores will be added to the leaderboard. For details, please refer to https://stanfordmlgroup.github.io/competitions/chexpert/
- If you want to train a model yourself from scratch, we provide the training and testing code. In addition, we provide complete training instructions.
- If you want to use our model in your own method, we provide our best single-network pre-trained model, and the network code is available in this repository.
- Data preparation
We provide an example file at
config/train.csv
Follow its format for your own data and write its path to config/example.json
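As a minimal sketch of the last step, the snippet below writes a CSV path into a JSON config using only the Python standard library. The key name `train_csv` and the helper `set_csv_path` are assumptions made for illustration; check config/example.json for the key the training code actually reads.

```python
import json
from pathlib import Path


def set_csv_path(config_path, csv_path, key="train_csv"):
    """Store csv_path under `key` in a JSON config and return the updated dict.

    NOTE: "train_csv" is a hypothetical key name; use the one in your config.
    """
    config_file = Path(config_path)
    cfg = json.loads(config_file.read_text())
    cfg[key] = str(csv_path)
    # Rewrite the file in place, leaving all other settings untouched.
    config_file.write_text(json.dumps(cfg, indent=4))
    return cfg
```

For example, `set_csv_path("config/example.json", "config/train.csv")` would update the config in place.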
- To train the model, run the commands below. (We used four 1080 Ti GPUs for training, so at least 4 GPUs are recommended.)
pip install -r requirements.txt
python Chexpert/bin/train.py Chexpert/config/example.json logdir --num_workers 8 --device_ids "0,1,2,3"
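If your machine has a different number of GPUs, the `--device_ids` value has to change accordingly. The sketch below assembles the same training command for a given GPU count; `build_train_command` is a hypothetical helper, and it assumes GPU ids are numbered consecutively from 0.

```python
import shlex


def build_train_command(config, logdir, num_gpus, num_workers=8):
    """Return the training command as an argument list for `num_gpus` GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Assumption: GPUs are numbered 0..num_gpus-1, e.g. "0,1,2,3" for four.
    device_ids = ",".join(str(i) for i in range(num_gpus))
    return [
        "python", "Chexpert/bin/train.py", config, logdir,
        "--num_workers", str(num_workers),
        "--device_ids", device_ids,
    ]


cmd = build_train_command("Chexpert/config/example.json", "logdir", num_gpus=4)
print(shlex.join(cmd))  # shell-ready form of the argument list
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` to launch training from Python.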
- To test your model, run the commands below:
cd logdir/
- Because "save_top_k": 3 is set in config/example.json, up to 3 models may have been saved for ensembling. Copy the one you want to test before running the test script:
cp best1.ckpt best.ckpt
python classification/bin/test.py
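Since several checkpoints may be saved, one common way to ensemble them is to average their predicted probabilities per image and per label. The sketch below shows that averaging in plain Python. It is an assumption about how an ensemble could be formed, not a description of what test.py does, and `average_predictions` is a hypothetical helper.

```python
def average_predictions(per_model_probs):
    """Average probabilities across models.

    per_model_probs: one entry per model; each entry is a list of rows
    (one per image), and each row is a list of per-label probabilities.
    """
    if not per_model_probs:
        raise ValueError("need predictions from at least one model")
    n_models = len(per_model_probs)
    n_images = len(per_model_probs[0])
    averaged = []
    for i in range(n_images):
        # Collect the i-th image's row from every model, then average per label.
        rows = [model[i] for model in per_model_probs]
        averaged.append([sum(vals) / n_models for vals in zip(*rows)])
    return averaged
```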
- To plot the ROC curve and compute the AUC, run the command below:
python classification/bin/roc.py plotname
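For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it that way for a single label; it is a plain-Python illustration of the definition, not the code in roc.py, and `roc_auc` is a hypothetical helper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

This pairwise form is quadratic in the number of samples, which is fine for a few hundred studies; library implementations use sorting instead.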
- How about a cup of coffee?
Run the command below to train in the background, then go have a cup of coffee. (The log will be written to disk.)
python Chexpert/bin/train.py Chexpert/config/example.json logdir --num_workers 8 --device_ids "0,1,2,3" --logtofile True &
- We provide one pre-trained model here:
config/pre_train.pth
We tested it on a dataset of 200 patients and obtained the following AUCs:
Cardiomegaly | Edema | Consolidation | Atelectasis | Pleural_Effusion |
---|---|---|---|---|
0.8703 | 0.9436 | 0.9334 | 0.9029 | 0.9166 |
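For a single summary number, the five AUCs above can be averaged. The snippet below is plain arithmetic on the table values; the mean itself is not reported in the table.

```python
# Per-label AUCs copied from the table above.
aucs = {
    "Cardiomegaly": 0.8703,
    "Edema": 0.9436,
    "Consolidation": 0.9334,
    "Atelectasis": 0.9029,
    "Pleural_Effusion": 0.9166,
}

mean_auc = sum(aucs.values()) / len(aucs)
print(f"mean AUC: {mean_auc:.4f}")  # prints: mean AUC: 0.9134
```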
- To train the model starting from the pre-trained weights, run the command below:
python Chexpert/bin/train.py Chexpert/config/example.json logdir --num_workers 8 --device_ids "0,1,2,3" --pre_train "Chexpert/config/pre_train.pth"
- global_pool options in config/example.json that are currently supported for plotting heatmaps:
global_pool | Support |
---|---|
MAX | Yes |
AVG | Yes |
EXP | Yes |
LSE | Yes |
LINEAR | Yes |
PCAM | Yes |
AVG_MAX | No |
AVG_MAX_LSE | No |
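For orientation, here is a minimal pure-Python sketch of four of the pooling options above, applied to a flattened list of spatial values from one feature channel. The MAX, AVG, and LSE formulas are the common textbook definitions (with an assumed sharpness parameter `gamma` for LSE). The PCAM-style function follows the paper's general idea of using sigmoid probabilities as normalized attention weights. All of these are illustrative assumptions and may differ in detail from this repository's implementation.

```python
import math


def max_pool(values):
    """MAX: keep only the strongest activation."""
    return max(values)


def avg_pool(values):
    """AVG: every location contributes equally."""
    return sum(values) / len(values)


def lse_pool(values, gamma=10.0):
    """LSE: log-sum-exp, a smooth value between the mean and the max."""
    m = max(values)  # subtract the max for numerical stability
    total = sum(math.exp(gamma * (v - m)) for v in values)
    return m + math.log(total / len(values)) / gamma


def pcam_pool(features, logits):
    """PCAM-style: weight each feature by its normalized sigmoid probability."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    norm = sum(probs)
    return sum(f * p / norm for f, p in zip(features, probs))


values = [1.0, 2.0, 3.0]
print(max_pool(values), avg_pool(values), lse_pool(values))
print(pcam_pool(values, logits=[-10.0, -10.0, 10.0]))  # dominated by the last location
```

With all logits equal, the PCAM-style weights are uniform and the result matches average pooling; as one logit grows, the pooled value moves toward that location's feature.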
[Figure: heatmap comparison of the original image with AVG (dev mAUC: 0.895), LSE (dev mAUC: 0.896), and PCAM (dev mAUC: 0.896) pooling, shown for Cardiomegaly, Atelectasis, Pleural Effusion, and Consolidation.]
- You can plot heatmaps using the command below:
python Chexpert/bin/heatmap.py logdir/best1.ckpt logdir/cfg.json CheXper_valid.txt logdir/heatmap_Cardiomegaly/ --device_ids '0' --prefix 'Cardiomegaly'
where CheXper_valid.txt contains one JPG path per line.
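A path list like this can be generated rather than written by hand. The sketch below collects every `.jpg` under a directory and writes one path per line. `write_image_list` is a hypothetical helper, and whether heatmap.py expects absolute or relative paths is not stated here, so adjust the paths to match your setup.

```python
from pathlib import Path


def write_image_list(image_dir, list_path):
    """Write every .jpg under image_dir (searched recursively) to list_path.

    Paths are sorted and written one per line; the list is also returned.
    """
    paths = sorted(str(p) for p in Path(image_dir).rglob("*.jpg"))
    Path(list_path).write_text("".join(f"{p}\n" for p in paths))
    return paths
```

For example, `write_image_list("path/to/valid_images", "CheXper_valid.txt")` would produce the file passed to the heatmap command, where `path/to/valid_images` is a placeholder for your own image directory.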
About PCAM pooling
- PCAM Overview:
- If you think PCAM is a good way to generate heatmaps, you can cite our article like this:
@misc{ye2020weakly,
title={Weakly Supervised Lesion Localization With Probabilistic-CAM Pooling},
author={Wenwu Ye and Jin Yao and Hui Xue and Yi Li},
year={2020},
eprint={2005.14480},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
- If you have any questions, please post them on GitHub issues or email coolver@sina.com