A copy of the code used for training at Spark Summit.
# Code Examples
This project is a collection of examples used for training.
The following directory contains examples of working with nested data in RDDs, DataFrames, and Datasets:
- src/main/scala/com/malaska/spark/training/nested/
Here is an example of building a custom partitioner:
- /src/main/scala/com/malaska/spark/training/partitioning/
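
As a rough illustration of what a custom partitioner looks like (not the code in that directory), the sketch below extends Spark's `Partitioner` class. The class name `FirstCharPartitioner` and its routing rule are hypothetical, and the block assumes `spark-core` is on the classpath.

```scala
import org.apache.spark.Partitioner

// Hypothetical partitioner: routes each key by the first character of its
// string form, so keys that share a first character land in the same partition.
class FirstCharPartitioner(override val numPartitions: Int) extends Partitioner {
  require(numPartitions > 0, "numPartitions must be positive")

  override def getPartition(key: Any): Int = key match {
    case null => 0
    case k =>
      val s = k.toString
      // floorMod keeps the result in the range [0, numPartitions).
      if (s.isEmpty) 0 else Math.floorMod(s.charAt(0).toInt, numPartitions)
  }

  // Spark compares partitioners to decide whether two RDDs are already
  // partitioned the same way, so equality should be defined.
  override def equals(other: Any): Boolean = other match {
    case p: FirstCharPartitioner => p.numPartitions == numPartitions
    case _                       => false
  }

  override def hashCode: Int = numPartitions
}
```

On a pair RDD this would typically be applied with `rdd.partitionBy(new FirstCharPartitioner(4))`.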
Windowing is both a complex topic and one that showcases the idea of the Big and the Small:
- /src/main/scala/com/malaska/spark/training/windowing/
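
The following plain-Scala sketch is one possible reading of that idea, not the code in that directory. It assumes "the Big" means splitting a large data set by key and "the Small" means ordered, in-memory processing of one key's records. The `Event` type, the `trailingSums` helper, and the window size are all hypothetical.

```scala
// Hypothetical event type used only for this illustration.
final case class Event(key: String, ts: Long, value: Int)

object WindowingSketch {
  // For each key, order the events by timestamp and pair every event's
  // timestamp with the sum of its value and up to (size - 1) preceding values.
  def trailingSums(events: Seq[Event], size: Int): Map[String, Seq[(Long, Int)]] =
    events.groupBy(_.key).map { case (key, group) =>
      // "The Small": one key's records, sorted and scanned in memory.
      val sorted = group.sortBy(_.ts)
      val sums = sorted.indices.map { i =>
        val window = sorted.slice(math.max(0, i - size + 1), i + 1)
        (sorted(i).ts, window.map(_.value).sum)
      }
      key -> sums
    }
}
```

In Spark SQL, a comparable trailing window could be expressed with `Window.partitionBy("key").orderBy("ts").rowsBetween(-(size - 1), Window.currentRow)`, where the column names are again assumptions.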
Run the batch counting example:
CountingInAStreamExpBatchCounting localhost 9999 ./checkpoint
In a separate terminal, start a netcat listener on the same port and type messages:
nc -lk 9999
hi there
hi bob
Run the updateStateByKey counting example:
CountingInAStreamExpUpdateStateByKey localhost 9999 ./checkpoint
In a separate terminal, start a netcat listener on the same port and type messages:
nc -lk 9999
hi there
hi bob
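
To show the shape of the function that `updateStateByKey` expects, here is a minimal plain-Scala sketch. It assumes the example keeps a running count per word, which the source does not state. The names `updateCount` and `applyBatch` are hypothetical, and `applyBatch` only simulates what the stream would do per batch.

```scala
object RunningCountSketch {
  // Same shape as the update function passed to updateStateByKey: the new
  // values seen for a key in the current batch, plus the previous state if any.
  def updateCount(newValues: Seq[Long], previous: Option[Long]): Option[Long] =
    Some(previous.getOrElse(0L) + newValues.sum)

  // Simulation of one micro-batch: count the words in the batch, then fold
  // those counts into the running state using updateCount.
  def applyBatch(state: Map[String, Long], lines: Seq[String]): Map[String, Long] = {
    val batchCounts = lines
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size.toLong }
    // Keys that already have state are updated even when the batch has no new values.
    val keys = state.keySet ++ batchCounts.keySet
    keys.flatMap { k =>
      updateCount(batchCounts.get(k).toSeq, state.get(k)).map(k -> _)
    }.toMap
  }
}
```

With a `DStream[(String, Long)]` named `pairs` (a hypothetical name), the same function could be passed as `pairs.updateStateByKey(RunningCountSketch.updateCount _)`. Spark Streaming requires a checkpoint directory for this operation, which matches the `./checkpoint` argument above.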
Run the groupBy counting example:
CountingInAStreamExpGroupBy localhost 9999
In a separate terminal, start a netcat listener on the same port and type messages:
nc -lk 9999
hi there
hi bob
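Run the Dataset groupBy counting example:
CountingInAStreamDatasetExpGroupBy localhost 9999
In a separate terminal, start a netcat listener on the same port and type comma-separated messages:
nc -lk 9999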
bob,tim,FB,10,123
bob,tim,FB,11,124
bob,tim,FB,10,125
tim,cat,G,20,124
tim,cat,G,20,125
tim,cat,G,20,126
Run the mapWithState counting example:
CountingInAStreamMapWithState localhost 9999
In a separate terminal, start a netcat listener on the same port and type messages:
nc -lk 9999
hi there
hi bob