Explainable Deep Learning for Glaucomatous Visual Field Prediction: Artifact Correction Enhances Transformer Models
This repository contains the experimental notebooks for the paper "Explainable Deep Learning for Glaucomatous Visual Field Prediction: Artifact Correction Enhances Transformer Models". The study develops deep learning models trained on optical coherence tomography (OCT) scans with artifact-removal preprocessing and evaluates their accuracy and interpretability in predicting visual field (VF) thresholds.
- Data: Cross-sectional, retrospective datasets of reliable VF and OCT measurements obtained within six months of each other
- Training set: 1,674 VF-OCT pairs from 951 eyes
- Testing set: 429 VF-OCT pairs from 345 eyes
- Models trained: CNNs, Vision Transformer (ViT), and DINO-ViT
- Task: Estimate Humphrey Field Analyzer (HFA) 24-2 VF thresholds
- Input: Peripapillary retinal nerve fiber layer (RNFL) thickness maps
- Comparison: Models trained on original vs artifact-corrected datasets
- Evaluation metrics: Pointwise root mean square error (RMSE) and mean absolute error (MAE); an illustrative computation is sketched after this list
- Explainability techniques: GradCAM, GradCAM++, attention maps, uniform manifold approximation and projection (UMAP), and principal component analysis (PCA); an embedding-projection sketch also follows below
- Best-performing model: DINO-ViT trained on artifact-corrected datasets
  - RMSE = 4.44 dB
  - MAE = 3.46 dB
- Improvement: Global RMSE and MAE reductions of 0.15 dB compared to performance on original maps
- Findings: RNFL artifacts compromise DINO-ViT's predictive ability, which improves with artifact correction
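As a concrete illustration of the pointwise error metrics listed above, here is a minimal NumPy sketch. It assumes predictions and ground truth are arrays of per-location sensitivities in dB over 52 test points of the 24-2 grid (blind-spot points excluded); the function and the random data are hypothetical stand-ins, not the paper's evaluation code.

```python
import numpy as np

def pointwise_errors(y_true: np.ndarray, y_pred: np.ndarray):
    """Compute global pointwise RMSE and MAE in dB.

    y_true, y_pred: arrays of shape (n_fields, n_points), e.g. 52 test
    locations of the 24-2 grid with blind-spot points excluded.
    """
    diff = y_pred - y_true
    rmse = np.sqrt(np.mean(diff ** 2))
    mae = np.mean(np.abs(diff))
    return rmse, mae

# Hypothetical usage with random data standing in for VF thresholds (dB).
rng = np.random.default_rng(0)
y_true = rng.uniform(0, 35, size=(429, 52))
y_pred = y_true + rng.normal(0, 4, size=y_true.shape)
rmse, mae = pointwise_errors(y_true, y_pred)
print(f"RMSE = {rmse:.2f} dB, MAE = {mae:.2f} dB")
```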
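Likewise, the embedding projections (UMAP and PCA) used for explainability could be produced along the lines below. The embedding array here is a random placeholder for features extracted from a trained backbone, and the choice of scikit-learn and umap-learn is an assumption about tooling rather than the notebooks' actual code.

```python
import numpy as np
from sklearn.decomposition import PCA
import umap  # umap-learn package

# Placeholder: 429 embeddings of dimension 384, standing in for CLS features
# extracted from a trained ViT/DINO-ViT backbone on the test set.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(429, 384))

# 2-D projections for visualizing how the model organizes the test eyes.
pca_2d = PCA(n_components=2).fit_transform(embeddings)
umap_2d = umap.UMAP(n_components=2, random_state=0).fit_transform(embeddings)

print(pca_2d.shape, umap_2d.shape)  # (429, 2) (429, 2)
```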
Transformer-based models enhance the accuracy and interpretability of visual function estimations from OCT scans, with RNFL artifact correction further refining these improvements.
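For readers who want a feel for the transformer setup, the following is a hedged sketch of one way a DINO-ViT regressor for this task could be assembled: a self-supervised DINO ViT-S/16 backbone loaded from torch.hub, a linear head mapping the CLS embedding to 52 pointwise thresholds, and extraction of the last self-attention layer for attention-map visualization. The backbone choice, head, input size, and channel handling are illustrative assumptions; see dino.ipynb for the actual implementation.

```python
import torch
import torch.nn as nn

# Self-supervised DINO ViT-S/16 backbone (weights download on first use).
backbone = torch.hub.load("facebookresearch/dino:main", "dino_vits16")

class VFRegressor(nn.Module):
    """ViT backbone + linear head predicting 52 pointwise VF thresholds (dB)."""

    def __init__(self, backbone: nn.Module, embed_dim: int = 384, n_points: int = 52):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(embed_dim, n_points)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)   # CLS embedding, shape (B, 384)
        return self.head(feats)    # predicted thresholds, shape (B, 52)

model = VFRegressor(backbone)

# A single-channel RNFL thickness map replicated to 3 channels to match the
# ViT's expected 224x224 RGB input; the real preprocessing may differ.
rnfl_map = torch.rand(1, 1, 224, 224).repeat(1, 3, 1, 1)
pred_vf = model(rnfl_map)          # (1, 52) predicted thresholds

# Attention from the last transformer block for interpretability maps.
attn = backbone.get_last_selfattention(rnfl_map)  # (1, heads, tokens, tokens)
cls_attn = attn[0, :, 0, 1:]       # CLS-to-patch attention per head
```

In practice the backbone could be fine-tuned end to end or kept frozen with only the head trained; the notebooks define the actual training regime.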
The experimental notebooks are:
- `resnet34.ipynb` (ResNet-34 CNN)
- `vgg16.ipynb` (VGG16 CNN)
- `vit.ipynb` (Vision Transformer)
- `dino.ipynb` (DINO-ViT)
Please see `requirements.txt` for dependencies. You can install them using pip:

```bash
pip install -r requirements.txt
```