A Feature Extraction and Selection Tool for Categorizing Text Documents
- Read directory-structured and CSV-formatted datasets
- Directory-to-CSV dataset conversion
- Support for subcategories
- Feature extraction, including n-gram terms
- Best-term selection based on TF-IDF, Mutual Information, Information Gain, and other metrics
- Extracted features can be saved in WEKA ARFF format
- More detailed documentation is coming soon...
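
To make the feature list more concrete, here is a minimal Python sketch of three of the ideas it names: n-gram extraction, TF-IDF-based term selection, and export to WEKA ARFF. It is an illustration only, not this tool's code or API. The function names (`ngrams`, `top_terms_by_tfidf`, `to_arff`), the whitespace tokenization, and the choice to rank each term by its highest TF-IDF score in any document are all assumptions made for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def top_terms_by_tfidf(docs, k=5, n_max=2):
    """Rank terms by their highest TF-IDF score in any document (assumed scheme)."""
    term_counts = [Counter(ngrams(d.lower().split(), n_max)) for d in docs]
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(t for counts in term_counts for t in counts)
    best = {}
    for counts in term_counts:
        total = sum(counts.values())
        for term, c in counts.items():
            score = (c / total) * math.log(len(docs) / doc_freq[term])
            best[term] = max(best.get(term, 0.0), score)
    # Sort by descending score, then alphabetically so ties are deterministic.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


def to_arff(docs, labels, terms, relation="documents", n_max=2):
    """Build a minimal ARFF string: one numeric count per term, plus the class."""
    lines = [f"@RELATION {relation}", ""]
    # Attribute names are single-quoted because n-grams contain spaces.
    # Assumes terms themselves contain no quote characters.
    lines += [f"@ATTRIBUTE '{t}' NUMERIC" for t in terms]
    lines.append("@ATTRIBUTE class {" + ",".join(sorted(set(labels))) + "}")
    lines += ["", "@DATA"]
    for doc, label in zip(docs, labels):
        counts = Counter(ngrams(doc.lower().split(), n_max))
        row = [str(counts[t]) for t in terms] + [label]
        lines.append(",".join(row))
    return "\n".join(lines)


if __name__ == "__main__":
    docs = ["the cat sat", "the dog barked"]
    labels = ["cats", "dogs"]
    terms = top_terms_by_tfidf(docs, k=3)
    print(to_arff(docs, labels, terms))
```

Terms that occur in every document get an inverse document frequency of zero, so they fall to the bottom of the ranking. Mutual Information or Information Gain could replace the scoring line without changing the rest of the sketch.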