[MICCAI'24] fTSPL: Enhancing Brain Analysis with fMRI-Text Synergistic Prompt Learning
Early Accepted by MICCAI 2024
Using functional Magnetic Resonance Imaging (fMRI) to construct functional connectivity is a well-established paradigm for deep learning-based brain analysis. Recently, benefiting from the remarkable effectiveness and generalization brought by large-scale multi-modal pre-training, Vision-Language (V-L) models have achieved excellent performance on numerous medical tasks. However, applying a pre-trained V-L model to brain analysis presents two significant challenges: (1) the lack of paired fMRI-text data, and (2) the construction of functional connectivity from multi-modal data. To tackle these challenges, we propose an fMRI-Text Synergistic Prompt Learning (fTSPL) pipeline, which is the first to leverage a pre-trained V-L model to enhance brain analysis. In fTSPL, we first propose an Activation-driven Brain-region Text Generation (ABTG) scheme that automatically generates instance-level texts describing each fMRI scan, and then leverage the V-L model to learn multi-modal fMRI and text representations. We also propose a Prompt-boosted Multi-modal Functional Connectivity Construction (PMFCC) scheme that establishes correlations between fMRI-text representations and brain-region embeddings. This scheme serves as a plug-and-play module that can be connected to various Graph Neural Networks (GNNs) for brain analysis. Experiments on the ABIDE and HCP datasets demonstrate that our pipeline outperforms state-of-the-art methods on brain classification and prediction tasks. The code is available at https://github.com/CUHK-AIM-Group/fTSPL.
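To make the pipeline concrete, below is a minimal PyTorch sketch of the overall idea: generate per-region text from fMRI activations, encode both modalities, and build a multi-modal functional connectivity matrix from embedding correlations. This is an illustrative sketch only, not the released implementation; `describe_region`, the `nn.Embedding` stand-in for the pre-trained V-L text encoder, and all shapes are assumptions.

```python
# Hypothetical sketch of the fTSPL idea (not the authors' code): ABTG-style
# text generation from activations, multi-modal encoding, and PMFCC-style
# functional connectivity construction. All names and shapes are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_REGIONS, T, D = 90, 200, 128  # brain regions, time points, embedding dim (assumed)

def describe_region(region_id: int, mean_activation: float) -> str:
    """Instance-level text for one brain region (simplified stand-in for ABTG)."""
    level = "elevated" if mean_activation > 0 else "reduced"
    return f"Brain region {region_id} shows {level} activation."

# Synthetic fMRI BOLD signals: (regions, time points).
bold = torch.randn(N_REGIONS, T)

# 1) Activation-driven text generation (ABTG, simplified).
texts = [describe_region(i, bold[i].mean().item()) for i in range(N_REGIONS)]

# 2) Encode both modalities. A real pipeline would tokenize `texts` and run
#    them through the pre-trained V-L text tower; a learnable embedding table
#    stands in here so the sketch stays self-contained and runnable.
text_embed = nn.Embedding(N_REGIONS, D)
text_feats = text_embed(torch.arange(N_REGIONS))  # (N_REGIONS, D)
fmri_proj = nn.Linear(T, D)
fmri_feats = fmri_proj(bold)                      # (N_REGIONS, D)

# 3) Multi-modal functional connectivity (PMFCC, simplified): fuse the two
#    modalities and take pairwise cosine similarity as the connectivity matrix.
fused = F.normalize(text_feats + fmri_feats, dim=-1)
fc = fused @ fused.T                              # (N_REGIONS, N_REGIONS)

print(fc.shape)  # torch.Size([90, 90])
```

In this simplified view, the resulting `fc` matrix plays the role of a graph adjacency that can be passed to any GNN, mirroring the plug-and-play role of PMFCC described above.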