To understand the philosophy behind this project, please go through the prerequisites.
Part-1 & Part-2 deal with basic filtering of the data according to Computer Science domains and loading the filtered data into HBase tables.
Part-3 deals with forming a clustered Community Network from the filtered Citation Network.
Part-4 deals with collecting and computing details about individual papers that were not originally supplied to us, e.g. indegree, outdegree, median indegree, etc.
Part-5 deals with computing various metrics on the Citation Network to assess the strength of the different Computer Science communities.
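
As a rough illustration of the per-paper details mentioned for Part-4, here is a minimal sketch of how indegree and outdegree could be derived from a citation edge list. It is not the project's actual code: the function name `degree_stats`, the `(citing, cited)` pair format, and the reading of "median indegree" as the median over all papers in the network are assumptions made for this example.

```python
from collections import defaultdict
from statistics import median


def degree_stats(edges):
    """Compute per-paper degrees from (citing, cited) pairs (hypothetical format).

    Returns a dict of per-paper indegree/outdegree and the median indegree
    taken over every paper that appears in the edge list (an assumption).
    """
    indegree = defaultdict(int)   # number of papers citing this paper
    outdegree = defaultdict(int)  # number of papers this paper cites
    papers = set()

    for citing, cited in edges:
        outdegree[citing] += 1
        indegree[cited] += 1
        papers.update((citing, cited))

    stats = {
        p: {"indegree": indegree[p], "outdegree": outdegree[p]}
        for p in papers
    }
    median_indegree = median(indegree[p] for p in papers) if papers else 0
    return stats, median_indegree


# Toy citation network: A cites B and C, B cites C, D cites C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "C")]
stats, median_indegree = degree_stats(edges)
print(stats["C"], median_indegree)
```

In the real pipeline the edges would presumably come from the HBase tables populated in Part-1 & Part-2 rather than from an in-memory list.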