Topic taxonomy completion with hierarchical discovery of novel topic clusters

donalee/taxocom

TaxoCom: A Framework for Topic Taxonomy Completion

Overview

The TaxoCom framework discovers a complete topic taxonomy by recursively expanding a given topic hierarchy. Starting from the root node, it performs (1) locally discriminative embedding and (2) novelty adaptive clustering to selectively assign the terms of each node to one of its child nodes.
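The recursive, top-down assignment described above can be illustrated with a toy sketch. This is not the TaxoCom implementation: the embeddings are fixed rather than re-learned per node, a plain cosine-similarity cutoff stands in for novelty adaptive clustering, and all novel terms of a node are grouped into a single new child. The names `expand`, `threshold`, and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand(node, terms, embed, threshold=0.7):
    """Recursively push the terms of `node` down to its children.

    Assumption: a fixed similarity cutoff decides whether a term belongs
    to a known child or is treated as novel. The real framework uses
    novelty adaptive clustering instead.
    """
    result = {"name": node["name"], "terms": list(terms), "children": []}
    children = node.get("children", [])
    if not children:
        return result

    buckets = {child["name"]: [] for child in children}
    novel = []
    for term in terms:
        sims = {c["name"]: cosine(embed[term], c["center"]) for c in children}
        best = max(sims, key=sims.get)
        if sims[best] >= threshold:
            buckets[best].append(term)
        else:
            novel.append(term)

    # Recurse into the known children with the terms assigned to them.
    for child in children:
        result["children"].append(
            expand(child, buckets[child["name"]], embed, threshold)
        )
    # Simplification: all novel terms form one new child cluster.
    if novel:
        result["children"].append(
            {"name": node["name"] + "/novel", "terms": novel, "children": []}
        )
    return result


# Toy 2-D embeddings and a two-topic seed hierarchy (made up for illustration).
embed = {
    "election": (1.0, 0.0),
    "senate": (0.9, 0.1),
    "soccer": (0.0, 1.0),
    "tennis": (0.1, 0.9),
    "recipe": (-0.7, -0.7),
}
seed = {
    "name": "root",
    "children": [
        {"name": "politics", "center": (1.0, 0.0)},
        {"name": "sports", "center": (0.0, 1.0)},
    ],
}

taxonomy = expand(seed, list(embed), embed)
for child in taxonomy["children"]:
    print(child["name"], child["terms"])
```

In this toy run, "recipe" is dissimilar to both seed topics, so it ends up in a newly created child rather than being forced into an existing one.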

Run the code

STEP 1. Install the Python libraries and packages

  • python
  • numpy, scipy
  • spherecluster
  • scikit-learn 0.21 (for compatibility with spherecluster)
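Before running anything, it may help to confirm that the packages listed above are importable and that scikit-learn is on the 0.21 series. The following is a small sketch using only the standard library. The helper names `missing_modules` and `sklearn_is_0_21` are hypothetical and not part of the repository.

```python
import importlib.util

# Import names of the packages listed in STEP 1
# (scikit-learn is imported as `sklearn`).
REQUIRED = ["numpy", "scipy", "sklearn", "spherecluster"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def sklearn_is_0_21(version):
    """True if a version string such as '0.21.3' is in the 0.21 series."""
    return version.split(".")[:2] == ["0", "21"]


missing = missing_modules(REQUIRED)
print("missing packages:", ", ".join(missing) if missing else "none")

if "sklearn" not in missing:
    import sklearn

    if not sklearn_is_0_21(sklearn.__version__):
        print("warning: found scikit-learn", sklearn.__version__,
              "but the 0.21 series is expected")
```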

STEP 2. Download the dataset

  • Download the datasets from the following links, then place them in ./data/nyt and ./data/arxiv, respectively.
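A quick way to confirm the datasets landed in the expected places is to check for the two directories named above. This sketch assumes it is run from the repository root and checks only that the directories exist, since the files inside them are not listed here. The helper `missing_dataset_dirs` is hypothetical.

```python
from pathlib import Path

# Directory names taken from STEP 2.
DATASETS = ["nyt", "arxiv"]


def missing_dataset_dirs(root="."):
    """Return the expected dataset directories that do not exist under `root`."""
    data_dir = Path(root) / "data"
    return [str(data_dir / name) for name in DATASETS
            if not (data_dir / name).is_dir()]


missing_dirs = missing_dataset_dirs(".")
if missing_dirs:
    print("missing dataset directories:", ", ".join(missing_dirs))
else:
    print("all dataset directories found")
```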

STEP 3. Execute the TaxoCom framework

  • Run the code with the following commands:
      cd code
      bash run_taxocom.sh <dataset-name> <seed-taxo-name>
  • For example, to run on the downloaded nyt directory:
      bash run_taxocom.sh nyt seed_taxo
