Association-Patterns-family-history

##Technologies Used: Python, PyDev for the Eclipse IDE, Christian Borgelt's tools.

##Dataset used: A collection of transcribed medical reports available on the MTSamples website (http://mtsamples.com/).

##Description: A first project for learning the KDD (knowledge discovery in databases) process by applying association pattern mining techniques to a public dataset. Data preprocessing extracts the sentences that mention family members and removes stop words. Data mining applies the Apriori and FP-growth algorithms (Christian Borgelt's tools). Post-processing turns the resulting word associations into ordered wordLists, and the final step analyzes those wordLists against various factors.
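
The steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: the project runs Christian Borgelt's tools for the mining step, whereas this sketch uses a small pure-Python Apriori instead. The names `FAMILY_TERMS`, `STOP_WORDS`, `family_transactions`, `apriori` and `ordered_word_lists` are hypothetical, and the sample report text is invented for the example rather than taken from MTSamples.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical word lists; the project's real lists are not shown in this README.
FAMILY_TERMS = {"mother", "father", "sister", "brother", "aunt", "uncle"}
STOP_WORDS = {"a", "an", "and", "the", "of", "had", "has", "with", "his", "her"}


def family_transactions(report):
    """Return one set of content words per sentence that mentions a family member."""
    transactions = []
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if words & FAMILY_TERMS:
            transactions.append(words - STOP_WORDS)
    return transactions


def apriori(transactions, min_support):
    """Level-wise Apriori: map each frequent itemset to its absolute support."""
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def ordered_word_lists(frequent):
    """Sort itemsets by descending support, then alphabetically by their words."""
    rows = [(sorted(itemset), count) for itemset, count in frequent.items()]
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Invented sample text, for illustration only.
report = (
    "Her mother had diabetes and hypertension. "
    "Father had diabetes and hypertension. "
    "He denies chest pain. "
    "His sister has hypertension."
)
transactions = family_transactions(report)
word_lists = ordered_word_lists(apriori(transactions, min_support=2))
print(word_lists)
# [(['hypertension'], 3), (['diabetes'], 2), (['diabetes', 'hypertension'], 2)]
```

The sentence without a family term is dropped during preprocessing, so only three transactions reach the mining step. Support here is an absolute count of sentences; a relative threshold would work equally well.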

##TODO:

  • Add code blocks.
  • Modify the structure of the page.
  • Add JS elements.