Home
regunathb edited this page Apr 11, 2013 · 2 revisions
Sift is a set of libraries for extracting useful information from unstructured data. It employs techniques commonly used in Natural Language Processing, such as stemming, sentiment analysis, and word segmentation.
The Sift libraries are organized into projects, each providing a set of capabilities. For example:
- tagcloud - contains a library for generating tag clouds that can be written out as image files or as JSON files.
- runtime - provides a processing API that is inspired by MapReduce but uses data structures similar to those of Twitter Storm.
- batch - provides a Trooper batch-based execution container for running the 'runtime' and 'tagcloud' libraries.
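
To make the idea behind the 'runtime' and 'tagcloud' projects more concrete, here is a minimal sketch in Python of a MapReduce-style word count whose result is serialized as JSON tag-cloud data. It does not use Sift's API: the function names (`emit_words`, `count_words`, `tag_cloud_json`), the stop-word list, and the JSON field names `tag` and `weight` are all assumptions made for illustration.

```python
import json
import re
from collections import Counter

# Hypothetical stop-word list for illustration; not taken from Sift.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "for"}


def emit_words(document):
    """Map-like step: emit one (word, 1) tuple per token that is not a stop word."""
    for token in re.findall(r"[a-z']+", document.lower()):
        if token not in STOP_WORDS:
            yield (token, 1)


def count_words(tuples):
    """Reduce-like step: sum the counts emitted for each word."""
    counts = Counter()
    for word, n in tuples:
        counts[word] += n
    return counts


def tag_cloud_json(documents, top_n=5):
    """Run both steps and serialize the top_n words as a JSON list of tags.

    The "tag"/"weight" field names are an assumed format, not Sift's.
    """
    tuples = (t for doc in documents for t in emit_words(doc))
    counts = count_words(tuples)
    tags = [{"tag": w, "weight": c} for w, c in counts.most_common(top_n)]
    return json.dumps(tags)


docs = [
    "The delivery was fast and the packaging was great",
    "Great phone, fast delivery",
]
cloud = tag_cloud_json(docs, top_n=3)
print(cloud)
```

A renderer could then map each `weight` to a font size when drawing the cloud. A real pipeline would likely also apply stemming before counting so that variants such as "deliver" and "delivery" share one tag.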