- Scala (2.12.6)
- Python (3.9)
- Apache Kafka (2.8.0)
- Apache Spark (3.0.2)
- Mongo DB (4.4.3)
- MySQL (8.0..25)
- Start Kafka zookeeper and server
- Start Mysql and MongoDB
- Navigate to Python/src/producers folder and run kafka_producer.py to push messages to Apache Kafka topic
- Navigate to Scala/src/main/scala folder and run spark_stream_processing_app to start consuming from the Apacha Kafka topic
- After Spark app deployed to cluster it will start storing RAW JSON objects Dataframes to Mongo DB and aggregated data to MySQL table in batch mode