This repository contains an example job that emulates writing events to and reading events from Kafka with Protobuf SerDe, using Spark Streaming.
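A minimal sketch of the pattern the job demonstrates is shown below: a Spark Structured Streaming query that reads raw Protobuf bytes from a Kafka topic and decodes them with a ScalaPB-generated class. This is an illustration, not the repository's actual code; `Event`, the topic name `events`, and the broker address are all hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf
// Hypothetical ScalaPB-generated message; substitute the class generated from
// the .proto files in this repo, e.g.:
// import net.renarde.dbx.demos.proto.Event

object ProtobufStreamSketch {
  def main(args: Array[String]): Unit = {
    // Requires the spark-sql-kafka connector on the classpath.
    val spark = SparkSession.builder().appName("protobuf-stream-sketch").getOrCreate()
    import spark.implicits._

    // Decode the Kafka `value` column (raw Protobuf bytes) into a readable string.
    val decode = udf((bytes: Array[Byte]) => Event.parseFrom(bytes).toProtoString)

    spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
      .option("subscribe", "events")                       // hypothetical topic
      .load()
      .select(decode($"value").as("event"))
      .writeStream
      .format("console")
      .start()
      .awaitTermination()
  }
}
```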
- Clone the repository (or open it in IntelliJ IDEA)
- Generate the Protobuf specs via the command below (a sketch of the ScalaPB build wiring follows):

```bash
sbt clean compile
```
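The `sbt clean compile` step drives ScalaPB code generation. As a rough sketch (an assumption about this repo's build; check `build.sbt` and `project/plugins.sbt` for the exact versions), the wiring typically looks like this, and it is what produces the `target/scala-2.12/src_managed/main/scalapb` directory referenced below:

```scala
// project/plugins.sbt (versions are illustrative):
//   addSbtPlugin("com.thesamet" % "sbt-protoc" % "1.0.6")
//   libraryDependencies += "com.thesamet.scalapb" %% "compilerplugin" % "0.11.13"

// build.sbt: generate Scala sources from the .proto files into src_managed.
Compile / PB.targets := Seq(
  scalapb.gen() -> (Compile / sourceManaged).value / "scalapb"
)
```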
- In IntelliJ IDEA, mark `target/scala-2.12/src_managed/main` as a generated sources root. Important: un-mark the nested `main/scalapb` directory as a generated sources root, otherwise you'll run into issues while compiling the project in IntelliJ.
- Configure the Python environment and the Databricks CLI.
- Install and configure `dbx`:

```bash
pip install dbx
dbx configure --profile-name=<your-databricks-cli-profile-name>
```
- Provide the required properties in the `.env` file:

```bash
INSTANCE_PROFILE_NAME="your-instance-profile" # instance profile used to access the MSK instance
DATABRICKS_CONFIG_PROFILE="your-databricks-cli-profile-name"
KAFKA_BOOTSTRAP_SERVERS_TO_SECRETS="" # Kafka bootstrap servers string
```
- Create the secret scope:

```bash
make create-scope
```
- Add the secrets (a sketch of reading them back inside a job follows):

```bash
make add-secrets
```
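Here is a minimal sketch (an assumption, not the repository's actual code) of how the deployed job can read the Kafka bootstrap servers back out of the secret scope at runtime. The scope and key names are hypothetical, so use whatever `make create-scope` and `make add-secrets` actually register:

```scala
// Requires the com.databricks:dbutils-api dependency at compile time; the real
// DBUtils implementation is injected when the job runs on a Databricks cluster.
import com.databricks.dbutils_v1.DBUtilsHolder.dbutils

object SecretsSketch {
  // Hypothetical scope/key names -- align them with the Makefile targets.
  def kafkaBootstrapServers: String =
    dbutils.secrets.get(scope = "dbx-demo-scope", key = "kafka-bootstrap-servers")
}
```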
- Create a new instance pool named `dbx-pool` in your Databricks environment.
- To deploy and launch the jobs in dev mode (the jobs won't be created or updated; an ephemeral job run is used instead):
```bash
make dev-launch-generator
make dev-launch-processor
```
- To deploy the jobs so they'll be reflected in the Jobs UI:

```bash
make jobs-deploy
```
The local testing suite requires `sbt` and Docker, since testcontainers is used to spin up a Kafka environment for the unit tests. See the example test in `src/test/scala/net/renarde/dbx/demos/app/UnifiedAppTest.scala`; a simplified sketch of the testcontainers setup follows.
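Below is a minimal sketch (not the repository's `UnifiedAppTest`) of the testcontainers pattern: start a throwaway Kafka broker, point the code under test at it, and tear it down afterwards. It assumes `org.testcontainers:kafka` and ScalaTest are on the test classpath; the image tag is illustrative.

```scala
import org.scalatest.funsuite.AnyFunSuite
import org.testcontainers.containers.KafkaContainer
import org.testcontainers.utility.DockerImageName

class KafkaContainerSketch extends AnyFunSuite {
  test("an ephemeral Kafka broker is reachable from the test") {
    // Requires a running Docker daemon; testcontainers manages the container lifecycle.
    val kafka = new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:7.4.0"))
    kafka.start()
    try {
      // Hand kafka.getBootstrapServers to the code under test, e.g. as the
      // kafka.bootstrap.servers option of a Spark stream.
      assert(kafka.getBootstrapServers.nonEmpty)
    } finally kafka.stop()
  }
}
```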