This repository contains the code we wrote during Rock the JVM's Spark Streaming with Scala course. Unless explicitly mentioned otherwise, the code in this repository is exactly what was caught on camera.
- install Docker
- either clone the repo or download as zip
- open with IntelliJ as an SBT project
- in a terminal window, navigate to the folder where you downloaded this repo and run
docker-compose up
to build and start the containers. We will need them to integrate Spark with other systems, e.g. Kafka.
Clone this repository and check out the start tag by running the following in the repo folder:
git checkout start
To go back to the latest code, check out the master branch:
git checkout master
The repository was built while recording the lectures. Before each lecture, I tagged the current commit so you can easily go back to an earlier state of the repo!
The tags are as follows (most recent first):
- 7.2-science-spark
- 7.1-science-http-kafka
- 6.4-stateful-computation
- 6.3-watermarks
- 6.2-processing-time-windows
- 6.1-event-time-windows
- 5.4-sentiment-analysis
- 5.3-twitter-exercises
- 5.2-twitter-receiver
- 5.1-socket-source
- 4.5-cassandra
- 4.4-akka
- 4.3-jdbc
- 4.2-kafka-dstreams
- 4.1-kafka-structured-streaming
- 3.3-dstreams-window-functions
- 3.2-dstreams-transformations
- 3.1-dstreams
- 2.4-streaming-datasets
- 2.3-streaming-joins
- 2.2-streaming-aggregations
- 2.1-streaming-dataframes
- 1.2-spark-recap
- 1.1-scala-recap
When you watch a lecture, you can git checkout the appropriate tag and the repo will go back to the exact code I had when I started the lecture.
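As a minimal sketch of that workflow, the shell function below checks that a tag exists in your clone before switching to it. The name goto_lecture is hypothetical and not part of this repo; only the git commands it calls are standard.

```shell
# Hypothetical helper (not part of the course repo):
# jump to the code as it was at the start of a lecture.
goto_lecture() {
  tag=$1
  # Refuse to switch if the tag does not exist in this clone.
  if ! git rev-parse --verify --quiet "refs/tags/$tag" >/dev/null; then
    echo "unknown tag: $tag" >&2
    return 1
  fi
  # Check out the tagged commit (this leaves Git in a detached HEAD state).
  git checkout --quiet "$tag"
}
```

For example, goto_lecture 6.3-watermarks would put the repo at the start of the watermarks lecture. Because checking out a tag detaches HEAD, run git checkout master when you want to return to the branch.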
If you have changes to suggest to this repo, either
- submit a GitHub issue
- tell me in the course Q/A forum
- submit a pull request!