This project is part of the course CS6024 'Algorithmic Approaches to Computational Biology'. It was done by Naga venkata sai kumar (CS18S003) and Rahul Biswas (CS18S008) under the guidance of Prof. Manikandan Narayanan (http://maninarayanan.com/index.html).
Motivation: Helminths are multicellular organisms that develop a wide range of strategies to manipulate the host immune system. Immunity to helminths involves profound changes in both the innate and adaptive immune compartments, which can have a protective effect in inflammation and autoimmunity. Identifying the features that are important for predicting exclusively the double disease cases could help in developing a drug for the combination of the two diseases.
Result: We found a subset of features important for identifying the double disease cases exclusively. The subset is Urea (mg/dl), IL-5 (pg/ml), MCH (pg), MCV (fL), and Eosinophils (%). We pruned the 50-feature set down to these 5 features and measured the classification accuracy of double disease cases versus the other 3 classes, obtaining an accuracy of up to 81.6%.
Good Result: We found that the accuracy achieved with just 4 features is equal to the accuracy achieved with all 50 features.
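Recall and Precision scores
Recall is the fraction of actual double disease cases that the classifier identifies; precision is the fraction of predicted double disease cases that are actually double disease cases.
We got a recall score of around 0.29 to 0.35.
We got a precision score of 0.71 to 0.75.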
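
To make the scores above concrete, here is a minimal sketch of how precision, recall, and accuracy can be computed for the "double disease vs the rest" setting. It is not the project's actual code: the class names ("double", "single_a", "single_b", "none"), the toy labels, and the helper names are all hypothetical and only illustrate the definitions.

```python
def binarize(labels, positive):
    """Map multi-class labels to 1 (positive class) or 0 (any other class)."""
    return [1 if label == positive else 0 for label in labels]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary 0/1 label lists of equal length."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_accuracy(y_true, y_pred):
    """Compute precision, recall, and accuracy; empty denominators give 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Hypothetical toy data: four classes, "double" is the class of interest.
true_classes = ["double", "double", "double", "single_a",
                "single_b", "none", "none", "double"]
pred_classes = ["double", "none", "none", "single_a",
                "single_b", "none", "double", "none"]

# One-vs-rest: double disease cases against the other 3 classes.
y_true = binarize(true_classes, positive="double")
y_pred = binarize(pred_classes, positive="double")

precision, recall, accuracy = precision_recall_accuracy(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

Recall counts missed positive cases (false negatives) while precision counts false alarms (false positives), so a classifier can have fairly high precision and accuracy but low recall when it rarely predicts the positive class.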