The codes consist of three parts:
1 DDFM Inference: To generate the initial intermediate modality. This part of code comes from Zhao Z, Bai H, Zhu Y, et al. DDFM: denoising diffusion model for multi-modality image fusion[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023: 8082-8093. https://github.com/Zhaozixiang1228/MMIF-DDFM
2 Distillation stage: BMG pre-trained by distilling DDFM as its counterpart.
3 Fine-tuning stage: BMG tuned together with the feature extractor to suit the counting task and bridge two original modalities.