Unveiling Multi-level and Multi-modal Semantic Representations in the Human Brain using Large Language Models

Yuko Nakagi ¹, Takuya Matsuyama ¹, Naoko Koide-Majima, Hiroto Q. Yamaguchi, Rieko Kubo, Shinji Nishimoto ² and Yu Takagi ², EMNLP 2024 (¹ Equal first auther, ² Equal last auther)

[Paper] [Project Page] [Dataset]

This is a repository to reproduce the methods of our study on Multi-level and Multi-modal Semantic Representations in the Human Brain using Large Language Models that we published.

LLMs are now being used to study semantic representations in the human brain. However, most previous research has focused on single aspects such as speech or object content, overlooking the multi-level nature of semantic processing in real life.

To address this, we collected densely annotated fMRI datasets from participants watching over 8 hours of video. We then extracted latent representations from LLMs and multimodal LLMs. We built brain encoding models to predict brain activity from the latent representations.

We confirmed that the latent representations of different semantic content correspond to spatially distinct brain regions.

We also confirmed that Multimodal features correspond to brain activity better and more uniquely than the unimodal models.

Ensure that the necessary data is placed in ./data and set the path for drama2brain/data_const.py according to your environment.

Performing Ridge Regression with a Single Feature

python -m drama2brain.run_training_encoder --subject_name DM01\
                                            --time_delays 8 10\
                                            --modalities semantic\
                                            --modality_hparams speechTranscription_cw1_avgToken_ALL\
                                            --feat_names GPT2\
                                            --select_layers all\
                                            --device 0\
                                            --reg_type mono\
                                            --perm_type block\
                                            --reduce_dim default None\
                                            --dataset all

Below are explanations for some of the arguments. Full details are available in drama2brain/run_training_encoder.py:

Delay time (time_delays): 3-5 seconds
Modality-specific parameters (modality_hparams): speechTranscription_cw1_avgToken_ALL
- Expressed by concatenating various parameters used for feature creation with underscores (_)
  - Annotation type: speechTranscription
  - Time window: cw1
  - Token handling: avgToken
  - Annotator used: ALL
Model used for feature extraction (feat_names): GPT2

The following files will be primarily saved:

cv_cc_raw.npy: Prediction accuracy from cross-validation
cc_raw.npy: Prediction accuracy on test data
coef_raw.npy: Model weights
pvalues_corrected_raw.npy: p-values from statistical tests
dist_raw.png: Histogram of prediction accuracy on test data

(Note: The raw part of the file names may change depending on the dimensionality reduction method used.)

Performing Banded Ridge Regression with Multiple Features

python3 -m drama2brain.run_training_encoder --subject_name DM01\
                                            --time_delays 8 10\
                                            --modalities semantic semantic semantic semantic semantic\
                                            --modality_hparams speechTranscription_cw1_avgToken_ALL objectiveAnnot50chara_cw1_avgToken_ALL story_cw1_avgToken_ALL eventContent_cw1_avgToken_ALL timePlace_cw1_avgToken_ALL\
                                            --feat_names Llama2 Llama2 Llama2 Llama2 Llama2\
                                            --select_layers 32 13 17 20 2\
                                            --device 0\
                                            --reduce_dim pca 1280\
                                            --reg_type multi\
                                            --perm_type block\
                                            --n_iter 100\
                                            --dataset all

Full argument explanations are available in drama2brain/run_training_encoder.py.

Warning

The arguments modalities, modality_hparams, feat_names, and select_layers correspond to each other in parallel. In the above example, the first feature uses the conditions modalities=semantic, modality_hparams=speechTranscription_cw1_avgToken_ALL, feat_names=Llama2, and select_layers=32.
Currently, only one time_delays and reduce_dim can be selected. The same delay time and dimensionality reduction are applied to all features.

The following files will be saved:

cc_1280PCs.npy: Prediction accuracy on test data
coef_1280PCs.npy: Model weights
pvalues_corrected_1280PCs.npy: p-values from statistical tests
dist_1280PCs.png: Histogram of prediction accuracy on test data
preds_1280PCs.npy: Predicted brain activity on test data

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
drama2brain		drama2brain
images		images
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Unveiling Multi-level and Multi-modal Semantic Representations in the Human Brain using Large Language Models

Performing Ridge Regression with a Single Feature

Performing Banded Ridge Regression with Multiple Features

About

Releases

Packages

Languages

yu-takagi/drama2brain

Folders and files

Latest commit

History

Repository files navigation

Unveiling Multi-level and Multi-modal Semantic Representations in the Human Brain using Large Language Models

Performing Ridge Regression with a Single Feature

Performing Banded Ridge Regression with Multiple Features

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages