DeepVul is a multi-task transformer model that predicts gene essentiality and drug response from gene expression data. It uses a shared feature extractor to learn representations that can then be fine-tuned for the individual downstream tasks: gene essentiality prediction and drug response prediction.
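To make the layout concrete, here is a minimal sketch of such a shared-extractor/multi-head design in PyTorch. The class names, the single-token encoding of the expression profile, and the head structure are illustrative assumptions, not the actual DeepVul implementation; the hyperparameter names mirror the flags documented below.

```python
# Illustrative multi-task layout: one shared transformer extractor, two task
# heads. Hypothetical sketch; not taken from the DeepVul source.
import torch.nn as nn

class SharedExtractor(nn.Module):
    """Shared transformer encoder over gene-expression features."""
    def __init__(self, n_genes, hidden_state=500, nhead=2, num_layers=2,
                 dim_feedforward=2048, dropout=0.1):
        super().__init__()
        self.embed = nn.Linear(n_genes, hidden_state)
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_state, nhead=nhead,
            dim_feedforward=dim_feedforward, dropout=dropout,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, x):                  # x: (batch, n_genes)
        h = self.embed(x).unsqueeze(1)     # treat the profile as one token
        return self.encoder(h).squeeze(1)  # (batch, hidden_state)

class TaskHead(nn.Module):
    """Task-specific head, e.g. essentiality scores or drug-response values."""
    def __init__(self, hidden_state, n_outputs):
        super().__init__()
        self.out = nn.Linear(hidden_state, n_outputs)

    def forward(self, h):
        return self.out(h)
```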
To set up the environment, use the provided `condaenv.yml` file with conda. First, ensure you have conda installed, then run the following commands:

```bash
conda env create --file condaenv.yml
conda activate condaenv
```
To run the DeepVul model, you will need to download the following datasets and copy them into the `data` directory with the names shown below:

- Gene Expression: `OmicsExpressionProteinCodingGenesTPMLogp1.csv`
- Gene Essentiality: `CRISPRGeneEffect.csv`
- Drug Response: `primary-screen-replicate-collapsed-logfold-change.csv`
- Sanger Essentiality Data: `gene_effect.csv`
- Somatic Mutation Data: `CCLE_Oncomap3_Assays_2012-04-09.csv`
The model reads these files from the `data` directory, so the file names must match exactly.
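Before launching a run, it can help to verify that everything is in place. The following sketch assumes the repository root as the working directory and uses pandas, which may or may not match how DeepVul itself loads the data:

```python
# Sanity-check that the expected datasets are present, then peek at one.
from pathlib import Path
import pandas as pd

DATA_DIR = Path("data")
EXPECTED = [
    "OmicsExpressionProteinCodingGenesTPMLogp1.csv",
    "CRISPRGeneEffect.csv",
    "primary-screen-replicate-collapsed-logfold-change.csv",
    "gene_effect.csv",
    "CCLE_Oncomap3_Assays_2012-04-09.csv",
]

missing = [name for name in EXPECTED if not (DATA_DIR / name).exists()]
if missing:
    raise FileNotFoundError(f"Missing datasets in {DATA_DIR}/: {missing}")

# Example: load the expression matrix (cell lines x genes).
expression = pd.read_csv(DATA_DIR / EXPECTED[0], index_col=0)
print(expression.shape)
```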
When running the DeepVul model, you can specify various hyperparameters to control its behavior. The available flags and their defaults are listed below:

- `--pretrain_batch_size`: Batch size for pre-training data loading (default: 20)
- `--finetuning_batch_size`: Batch size for fine-tuning data loading (default: 20)
- `--hidden_state`: Hidden state size for the model (default: 500)
- `--pre_train_epochs`: Number of epochs for pre-training (default: 20)
- `--fine_tune_epochs`: Number of epochs for fine-tuning (default: 20)
- `--opt`: Optimizer type (default: "Adam")
- `--lr`: Learning rate for the optimizer (default: 0.0001)
- `--dropout`: Dropout rate (default: 0.1)
- `--nhead`: Number of attention heads in the multi-head attention layers (default: 2)
- `--num_layers`: Number of layers in the model (default: 2)
- `--dim_feedforward`: Dimension of the feedforward network (default: 2048)
- `--fine_tuning_mode`: Mode for fine-tuning (default: "freeze-shared"; options: "freeze-shared", "initial-shared")
- `--run_mode`: Run mode (options: "pre-train", "fine-tune", "both")
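For orientation, these flags correspond to a standard argparse setup along the following lines; this is a sketch mirroring the documented options, not the actual argument handling in `run_deepvul.py`:

```python
# Hypothetical argparse declaration matching the flags documented above.
import argparse

parser = argparse.ArgumentParser(description="DeepVul training options")
parser.add_argument("--pretrain_batch_size", type=int, default=20)
parser.add_argument("--finetuning_batch_size", type=int, default=20)
parser.add_argument("--hidden_state", type=int, default=500)
parser.add_argument("--pre_train_epochs", type=int, default=20)
parser.add_argument("--fine_tune_epochs", type=int, default=20)
parser.add_argument("--opt", type=str, default="Adam")
parser.add_argument("--lr", type=float, default=0.0001)
parser.add_argument("--dropout", type=float, default=0.1)
parser.add_argument("--nhead", type=int, default=2)
parser.add_argument("--num_layers", type=int, default=2)
parser.add_argument("--dim_feedforward", type=int, default=2048)
parser.add_argument("--fine_tuning_mode", type=str, default="freeze-shared",
                    choices=["freeze-shared", "initial-shared"])
parser.add_argument("--run_mode", type=str,
                    choices=["pre-train", "fine-tune", "both"])
args = parser.parse_args()
```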
First, change your current directory to `src`:

```bash
cd src
```
To run the pre-training process, use the following command:

```bash
python run_deepvul.py --pretrain_batch_size 20 --hidden_state 1000 --pre_train_epochs 20 --opt "Adam" --lr 0.0005 --dropout 0.2 --nhead 4 --num_layers 2 --dim_feedforward 1024 --run_mode pre-train
```

To run the fine-tuning process, use the following command:

```bash
python run_deepvul.py --finetuning_batch_size 20 --hidden_state 1000 --fine_tune_epochs 20 --opt "Adam" --lr 0.0005 --dropout 0.2 --nhead 4 --num_layers 2 --dim_feedforward 1024 --fine_tuning_mode "freeze-shared" --run_mode fine-tune
```
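Judging by the option names, the two fine-tuning modes likely differ in how the pre-trained shared extractor is treated: "freeze-shared" would keep it fixed and train only the task-specific head, while "initial-shared" would use the pre-trained weights merely as initialization and keep them trainable. A sketch of that reading, reusing the hypothetical `SharedExtractor` from above:

```python
# Hypothetical illustration of the two fine-tuning modes; inferred from the
# flag names, not from the DeepVul source.
def configure_fine_tuning(shared_extractor, mode="freeze-shared"):
    if mode == "freeze-shared":
        # Keep the pre-trained shared extractor fixed; only task heads train.
        for param in shared_extractor.parameters():
            param.requires_grad = False
    elif mode == "initial-shared":
        # Pre-trained weights serve only as initialization; the shared
        # extractor continues to receive gradient updates.
        for param in shared_extractor.parameters():
            param.requires_grad = True
    else:
        raise ValueError(f"Unknown fine-tuning mode: {mode}")
    return shared_extractor
```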
To run both pre-training and fine-tuning sequentially, use the following command:

```bash
python run_deepvul.py --pretrain_batch_size 20 --finetuning_batch_size 20 --hidden_state 1000 --pre_train_epochs 20 --fine_tune_epochs 20 --opt "Adam" --lr 0.0005 --dropout 0.2 --nhead 4 --num_layers 2 --dim_feedforward 1024 --fine_tuning_mode "freeze-shared" --run_mode both
```
For more details on the model and its implementation, please refer to the source code and associated documentation. If you encounter any issues or have questions, please open an issue or contact the maintainers.
If you use DeepVul in your work, please cite:

```bibtex
@article{Jararweh2024.10.17.618944,
  author = {Jararweh, Ala and Arredondo, David and Macaulay, Oladimeji and Dicome, Mikaela and Tafoya, Luis and Hu, Yue and Virupakshappa, Kushal and Boland, Genevieve and Flaherty, Keith and Sahu, Avinash},
  title = {DeepVul: A Multi-Task Transformer Model for Joint Prediction of Gene Essentiality and Drug Response},
  elocation-id = {2024.10.17.618944},
  year = {2024},
  doi = {10.1101/2024.10.17.618944},
  publisher = {Cold Spring Harbor Laboratory},
  URL = {https://www.biorxiv.org/content/early/2024/10/21/2024.10.17.618944},
  eprint = {https://www.biorxiv.org/content/early/2024/10/21/2024.10.17.618944.full.pdf},
  journal = {bioRxiv}
}
```