This repository contains all Data, Models and Scripts accompanying the manuscript:
Predicting Blood-Brain Barrier Permeability of Marine-Derived Kinase Inhibitors Using Ensemble Classifiers Reveals Potential Leads for Neurodegenerative Disorders. Fabien Plisson and Andrew M. Piggott. Marine Drugs. January 2019.
-
Data:
- datasetsCompounds.xlsx contains the original SMILES strings of all 968 compounds: CNS-penetrant small molecules, kinase drugs and marine-derived kinase inhibitors.
- datasetsDescrs.csv, datasetsNormalizedDescrs.csv and datasetsMorganFingerprints.csv contain, respectively, the 200 calculated physicochemical descriptors, their normalized values, and the Morgan fingerprints for all 968 chemical structures.
- logBBvalues.csv contains logBB values for 332 CNS-penetrant small molecules.
- similarity_matrix... contain matrices of similarity measurements between all 968 structures, computed with different fingerprints (Atom Pairs, MACCS Keys, Topological, Topological Torsions).
- mahanalobis_distance_modelset and mahanalobis_distance_holdoutset.csv contain the Mahalanobis distances calculated for all 968 structures (model and holdout sets).
- predictions_modelset.csv and predictions_holdoutset.csv contain the predicted class memberships (0 = BBB-, 1 = BBB+) and their probability estimates for all 968 structures (model and holdout sets) from our top 3 models (RFC, GBC, LOGREG).
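
The Mahalanobis distance files above record how far each compound lies from the centre of a reference set in descriptor space. The sketch below shows one way such distances could be computed with NumPy. It is an illustration under stated assumptions, not the exact procedure used in the scripts: the function name `mahalanobis_distances` is hypothetical, the reference set is assumed to be the model set, and a pseudo-inverse of the covariance matrix is used so that collinear descriptors do not break the calculation.

```python
import numpy as np


def mahalanobis_distances(reference, queries):
    """Return the Mahalanobis distance of each query row from the
    centroid of the reference rows (hypothetical helper, not taken
    from the repository scripts)."""
    reference = np.asarray(reference, dtype=float)
    queries = np.asarray(queries, dtype=float)

    # Centroid and covariance are estimated from the reference set only
    # (assumed here to be the model set).
    centroid = reference.mean(axis=0)
    cov = np.atleast_2d(np.cov(reference, rowvar=False))

    # The pseudo-inverse tolerates a singular covariance matrix, which is
    # common when descriptors are strongly correlated.
    inv_cov = np.linalg.pinv(cov)

    # Quadratic form d_i^T * inv_cov * d_i for every query row i.
    diff = queries - centroid
    squared = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

    # Clip tiny negative values caused by floating-point round-off.
    return np.sqrt(np.clip(squared, 0.0, None))
```

With normalized descriptors as input, the same function could be applied to the holdout set by passing the model-set rows as `reference` and the holdout-set rows as `queries`.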
-
Scripts:
- 1_Data_Preparation (Python 3.6)
- 2_Exploratory_Data_Analysis (R, Python 3.6)
- 3_Models... (Python 3.6)
- _4_Predictions_Holdoutset (Python 3.6)
-
Models:
- Pickled files of our top 3 models (RFC, GBC, LOGREG).
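
A pickled model can be restored with Python's standard `pickle` module and then asked for class labels and probability estimates. The sketch below assumes the pickled objects expose scikit-learn-style `predict` and `predict_proba` methods, and that the second probability column corresponds to class 1 (BBB+). The helper names `load_model` and `predict_bbb` are hypothetical. Only unpickle files from a trusted source, since unpickling can execute arbitrary code.

```python
import pickle


def load_model(path):
    """Load a pickled object from disk (only use with trusted files)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict_bbb(model, descriptors):
    """Return (class label, probability of class 1) for each input row.

    Assumes `model` offers scikit-learn-style predict / predict_proba
    methods and that column 1 of predict_proba is the BBB+ class.
    """
    labels = model.predict(descriptors)
    probabilities = model.predict_proba(descriptors)
    return [
        (int(label), float(row[1]))
        for label, row in zip(labels, probabilities)
    ]
```

The `descriptors` argument would need the same columns, order and normalization that the model was trained on. Unpickling may also require a library version compatible with the one used to create the files.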