This Python project uses neural networks and genetic algorithms to design bioinsecticide compounds targeting specific proteins. By leveraging IC50 values from FASTA sequences, it generates and optimizes SMILES representations, enhancing the efficacy and specificity of bioinsecticides.

RubenVG02/BioInsectiNet

BioInsectiNet: Neural Network and Genetic Algorithm Framework for Bioinsecticide Design

Overview

This project enables the discovery and design of novel bioinsecticides targeting specific proteins. It features tools for predicting toxicity, generating bioinsecticides, and obtaining 3D structures of the designed molecules. By leveraging neural networks for toxicity prediction and bioinsecticide generation, combined with genetic algorithms for design refinement, this project enhances the efficiency and specificity of bioinsecticide development.

Usage

Preparing Data

  1. FASTA Sequence: Ensure your target protein is in FASTA format (amino acid sequence).

  2. Neural Networks: You need two neural networks:

    • Toxicity Prediction: Use cnn_affinity.py to train or utilize the pre-trained model.
    • Bioinsecticide Generation: Use generate_rnn.py to train or utilize the pre-trained model.

    Alternatively, use the pre-trained models located in the definitive_models folder.

  3. Data: Use data from databases such as ChEMBL, PubChem, or the provided "insect.csv".
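
Step 1 above expects the target protein as an amino acid sequence in FASTA format. As a rough illustration of what that input looks like once loaded, here is a minimal sketch that reads the first record of a FASTA file into a single string. The function name read_fasta_sequence and the strict check against the 20 standard residue codes are assumptions made for this example and are not taken from the repository's scripts.

```python
from pathlib import Path

# The 20 standard one-letter amino acid codes. Ambiguity codes such as X or B
# are rejected in this sketch; that strictness is an assumption of the example.
VALID_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_fasta_sequence(path):
    """Return the first sequence in a FASTA file as one uppercase string."""
    sequence_parts = []
    seen_header = False
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if seen_header:
                break  # a second record starts here; keep only the first
            seen_header = True
            continue  # header lines carry no sequence data
        sequence_parts.append(line.upper())
    sequence = "".join(sequence_parts)
    unknown = set(sequence) - VALID_AMINO_ACIDS
    if unknown:
        raise ValueError(f"Unexpected residue codes: {sorted(unknown)}")
    return sequence
```

Calling read_fasta_sequence("target.fasta") (a hypothetical file name) returns the sequence with header and line breaks removed, which is a quick way to confirm that a file is a protein FASTA before passing it to --target_path.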

CNN Usage (Affinity Prediction)

To predict toxicity using the CNN model, run:

python check_affinity.py --model_path <path_to_model> --data_path <path_to_data> --target_path <path_to_target_protein>

The program will return the toxicity of the designed bioinsecticides using the 'calculate_affinity' function.

RNN Usage (Bioinsecticide Generation)

To generate bioinsecticides using the RNN model, run:

python pretrained_rnn.py --model_path <path_to_model> --data_path <path_to_data> --target_path <path_to_target_protein>

The program will return the designed bioinsecticides using the 'generate' function.

Combination of Models

For combining both models (generation and toxicity prediction), use:

python affinity_with_target_and_generator.py --model_path <path_to_model> --data_path <path_to_data> --target_path <path_to_target_protein> --toxicity_limit <toxicity_limit> --output_path <path_to_output>

The program will generate bioinsecticides and filter out those exceeding the specified toxicity limit. You can also specify a path to check generated molecules.
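
The filtering step described above can be pictured with a short sketch. This is not the code in affinity_with_target_and_generator.py. The function name filter_by_toxicity, the (SMILES, value) pair layout, and the numbers in the example are placeholders chosen for illustration only.

```python
def filter_by_toxicity(predictions, toxicity_limit):
    """Keep (smiles, value) pairs whose predicted value does not exceed the limit.

    `predictions` is assumed to be an iterable of (smiles, predicted_value)
    pairs; the original order of the kept pairs is preserved.
    """
    return [
        (smiles, value)
        for smiles, value in predictions
        if value <= toxicity_limit
    ]


# Placeholder values, not real model output.
candidates = [("CCO", 120.0), ("c1ccccc1", 35.5), ("CC(=O)O", 980.0)]
kept = filter_by_toxicity(candidates, toxicity_limit=150.0)
print(kept)  # [('CCO', 120.0), ('c1ccccc1', 35.5)]
```

A value equal to the limit is kept here because the text says only molecules exceeding the limit are filtered out.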

Genetic Algorithm

To use the genetic algorithm, run:

python genetic_algorithm.py --smiles_list <smiles_list> --csv_file <path_to_csv_file> --rnn_model <path_to_rnn_model> --model_path <path_to_model> --generations <number_of_generations> --output_path <path_to_output>

You can provide SMILES sequences directly, via a CSV file, or use an RNN model to guide the generation. The program will return the best SMILES sequence from the last generation.
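
To make the idea of refining candidates over several generations concrete, here is a generic sketch of a genetic algorithm loop with elitist selection and mutation. It does not reproduce genetic_algorithm.py. The evolve function, its parameters, and the toy fitness and mutation functions are assumptions for this example. In particular, the toy mutation appends characters without checking chemical validity, and a real run would score candidates with the trained affinity model instead of counting characters.

```python
import random


def evolve(population, fitness, mutate, generations, survivors=2, seed=0):
    """Return the fittest individual after the given number of generations.

    Each generation keeps the `survivors` best individuals unchanged (elitism)
    and refills the population with mutated copies of them.
    """
    population = list(population)
    if not population:
        raise ValueError("population must not be empty")
    rng = random.Random(seed)  # seeded so that runs are repeatable
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:survivors]
        children = [
            mutate(rng.choice(parents), rng)
            for _ in range(size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(smiles):
    # Placeholder score: number of "C" characters in the string.
    return smiles.count("C")


def toy_mutate(smiles, rng):
    # Placeholder mutation: append one atom symbol, with no validity check.
    return smiles + rng.choice("CNO")


best = evolve(["CCO", "CN", "O"], toy_fitness, toy_mutate, generations=5)
print(best, toy_fitness(best))
```

Because the best individuals are carried over unchanged, the best score in the population never decreases from one generation to the next, so the result of the last generation is at least as good as the best starting sequence.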

3D Structure Generation

To obtain the 3D structure of the designed bioinsecticides, run:

python 3d_repr.py --model_path <path_to_model> --data_path <path_to_data> --target_path <path_to_target_protein> --toxicity_limit <toxicity_limit> --output_path <path_to_output>

This generates an SDF file containing the 3D structures of the bioinsecticides. To convert the SDF file to other formats (e.g., PDB), use the pymol_3d.py script, which relies on PyMOL.

Installation

Clone the repository:

git clone https://github.com/RubenVG02/BioinsecticidesDiscovery.git

Or download the latest release from the releases page:

https://github.com/RubenVG02/BioinsecticidesDiscovery/releases/latest

Ensure Python 3.7 or higher is installed. Install the required libraries using:

pip install -r requirements.txt

Features

  • Design of new bioinsecticides based on the target protein
  • Improving the structure of previously designed bioinsecticides based on the target protein
  • Predicting the toxicity of the designed bioinsecticides
  • Obtaining CSV files and screenshots of the results
  • Obtaining the 3D structure of the designed bioinsecticides in different formats (SDF, PDB, etc.)
  • Fast and easy to use

Future Improvements

  • Add more databases to the CNN
  • Add more databases to the RNN
  • Add more complexity to the genetic algorithm (GA)

License

MIT
