Infovore is an RDF processing system that uses Hadoop to process RDF data sets in the billion-triple range and beyond. Infovore was originally designed to convert the (old) proprietary Freebase dump into RDF, but once Freebase came out with an official RDF dump, Infovore gained the ability to clean and purify that dump, making it not just possible but easy to process Freebase data with triple stores such as Virtuoso 7.
Every week we run Infovore on Amazon Elastic MapReduce to produce a product known as :BaseKB.
Infovore depends on the Centipede framework for packaging and processing command-line arguments. The Telepath project extends the Infovore project in order to process Wikipedia usage information to produce a product called :SubjectiveEye3D.
It costs several hundred dollars per month to process and store the files connected with this work. Please join Gittip and make a small weekly donation to keep this data free.
Infovore software requires JDK 7 and is built with Maven:
mvn clean install
The following cantrip, run from the top-level "infovore" directory, sets up the bash shell for use with the "haruhi" program, which can be used to run Infovore applications packaged in the Bakemono Jar.
source haruhi/target/path.sh
See
https://github.com/paulhoule/infovore/wiki
for documentation and join the discussion group at