How it works

  1. Set the HDFS input path in config.properties (only Parquet files are supported for now).

  2. Set the name of the input text column in config.properties.

  3. Set the HDFS output path in config.properties.

  4. Submit the job: spark-submit --class org.opentools.extraction.ExtractTopics --master yarn --deploy-mode cluster ExtractTopics-1.0.jar
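
The three settings above could be read and checked before the job starts, roughly as in the sketch below. This README does not list the actual property names, so the keys input.path, input.text.column and output.path, the class name ConfigSketch, and the example HDFS paths are all assumptions made for illustration. The sketch only shows the general pattern of loading a properties file with java.util.Properties and failing fast on a missing value; it is not the project's own loader.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class ConfigSketch {
    // Hypothetical key names; the real keys in config.properties may differ.
    static final List<String> REQUIRED_KEYS =
            Arrays.asList("input.path", "input.text.column", "output.path");

    // Parse properties text and fail fast if a required key is missing or blank.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalArgumentException("Missing required setting: " + key);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Example values only; these paths and the column name are placeholders.
        String example = String.join("\n",
                "input.path=hdfs:///data/docs/parquet",
                "input.text.column=text",
                "output.path=hdfs:///data/topics");
        Properties props = load(example);
        for (String key : REQUIRED_KEYS) {
            System.out.println(key + " = " + props.getProperty(key));
        }
    }
}
```

Checking the settings up front means a typo in config.properties surfaces as a clear error message rather than as a failure partway through a cluster job.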

Compilation

mvn clean compile

Uber Jar

mvn compile assembly:single

References