This project accompanies the talk "Processor API". It gathers a few code examples showing how the Streams DSL from Kafka Streams relies on a lower-level API, and why this API is exposed. While describing the library, this module also illustrates a few stream processing concepts.
Prerequisites:
git clone git@github.com:DivLoic/xke-stream-fighter.git
cd xke-stream-fighter
sbt dockerComposeUp
This starts a set of containers, including the Confluent stack, a dataset generator and the example streaming application. Start a consumer to see the input stream:
docker-compose -p <id> exec registry kafka-avro-console-consumer --bootstrap-server kafka:9092 --topic ROUNDS
The output stream represents the append log of the aggregation and can be seen with the following command:
docker-compose -p <id> exec registry kafka-avro-console-consumer --bootstrap-server kafka:9092 --topic RESULTS-DSL
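To make the "append log" idea concrete, here is a minimal sketch in plain Java with no Kafka dependency. It assumes a simple running count per key. The class name ChangelogSketch, the method countWithChangelog and the sample winner names are hypothetical and not part of this project. In this sketch every incoming key appends one updated record, so the same key shows up several times and its latest record carries the current count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not code from this project and not the Kafka Streams API.
class ChangelogSketch {

    // Counts occurrences per key and returns every intermediate update,
    // formatted as "key=count", in the order the updates were produced.
    static List<String> countWithChangelog(List<String> keys) {
        Map<String, Long> counts = new HashMap<>();   // stands in for the state store
        List<String> changelog = new ArrayList<>();   // stands in for the output topic
        for (String key : keys) {
            long updated = counts.merge(key, 1L, Long::sum);
            changelog.add(key + "=" + updated);       // one appended record per update
        }
        return changelog;
    }

    public static void main(String[] args) {
        // Made-up sample data.
        List<String> winners = Arrays.asList("Ryu", "Ken", "Ryu", "Ryu");
        // Prints [Ryu=1, Ken=1, Ryu=2, Ryu=3]
        System.out.println(countWithChangelog(winners));
    }
}
```

Reading only the last record per key from this list gives the current state of the aggregation, which is how a log of updates and a table relate to each other.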
The interactive queries are dumped into a file inside the stream app container. Given the project name <id> provided by the docker-compose plugin, you can watch this file:
docker-compose -p <id> exec processors scripts/watch-interactive-queries.sh DSL
Despite the importance of the Processor API in the kafka-streams library, there are not many resources (talks, examples, demos ...) about it. This module was created to demonstrate most of its features. It is based on the Confluent documentation.
The high-level API brings the KStream & KTable abstractions.
It's simple, expressive and declarative. Here is a simple aggregation.
StreamsBuilder builder = new StreamsBuilder();

GlobalKTable<String, Arena> arenaTable = builder.globalTable(/* */ "ARENAS");
KStream<String, Round> rounds = builder.stream(/* */ "ROUNDS");

rounds
    // keep only the rounds of the game we are interested in
    .filter((String arenaId, Round round) -> round.getGame() == StreetFighter)
    // keep the arena id as key, replace the round by its winner
    .map((String arenaId, Round round) -> new KeyValue<>(arenaId, round.getWinner()))
    // look up the arena by id in the global table and build a Victory
    .join(arenaTable, (arenaId, player) -> arenaId, Victory::new)
    .selectKey(Parsing::extractConceptAndCharacter)
    .groupByKey().windowedBy(window).count(/* */);
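The last line of the aggregation above groups records by key and by time window before counting them. The following plain-Java sketch shows what such a count computes, assuming a tumbling window aligned to the epoch (the README does not say which kind of window is used). WindowedCountSketch, its methods and the sample data are hypothetical names introduced for illustration only.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration, not code from this project and not the Kafka Streams API.
class WindowedCountSketch {

    // Start of the tumbling window that contains the given (non-negative) timestamp.
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Counts events per (key, window start); the map key is formatted as "key@windowStart".
    static Map<String, Long> countByWindow(String[] keys, long[] timestampsMs, long windowSizeMs) {
        Map<String, Long> counts = new TreeMap<>();
        for (int i = 0; i < keys.length; i++) {
            String windowedKey = keys[i] + "@" + windowStart(timestampsMs[i], windowSizeMs);
            counts.merge(windowedKey, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data: one key and one event timestamp per record.
        String[] keys = {"Ryu", "Ryu", "Ken", "Ryu"};
        long[] timestamps = {1_000L, 59_000L, 30_000L, 61_000L};
        // With a 60 s window this prints {Ken@0=1, Ryu@0=2, Ryu@60000=1}
        System.out.println(countByWindow(keys, timestamps, 60_000L));
    }
}
```

The same key ends up in two different counters as soon as its records fall into two different windows, which is why the result of a windowed aggregation is keyed by both the original key and the window.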
But this API won't let you access the state stores directly.
By implementing a processor, you get access to a processor context, which exposes a lot of metadata and services.
public class ProcessPlayer implements Processor<String, Player> {

    private ProcessorContext context;
    private KeyValueStore<String, Arena> arenaStore;

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        this.context = context;
        // the store is looked up by name in the processor context
        this.arenaStore = (KeyValueStore<String, Arena>) context.getStateStore("ARENA-STORE");
    }

    @Override
    public void process(String key, Player value) {
        // forward a Victory only when the arena is found in the store
        Optional<Arena> mayBeArena = Optional.ofNullable(this.arenaStore.get(key));
        mayBeArena.ifPresent(arena -> {
            Victory victory = new Victory(value, arena);
            GenericRecord victoryKey = groupedDataKey(victory);
            context.forward(victoryKey, victory);
        });
    }
}
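The processor above follows a lookup-and-forward pattern: read the state store, and emit a record downstream only if the lookup succeeds. Below is a minimal stand-alone sketch of that pattern in plain Java, so it can run without a Kafka cluster. It is an assumption-laden simplification: a Map replaces the state store, a BiConsumer replaces context.forward, and LookupAndForwardSketch and its sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiConsumer;

// Hypothetical illustration, not code from this project and not the Kafka Streams API.
class LookupAndForwardSketch {

    private final Map<String, String> arenaStore;         // stands in for the key-value state store
    private final BiConsumer<String, String> downstream;  // stands in for context.forward(...)

    LookupAndForwardSketch(Map<String, String> arenaStore, BiConsumer<String, String> downstream) {
        this.arenaStore = arenaStore;
        this.downstream = downstream;
    }

    // Forwards "<player> wins in <arena>" only when the arena id is present in the store.
    void process(String arenaId, String player) {
        Optional.ofNullable(arenaStore.get(arenaId))
                .ifPresent(arena -> downstream.accept(arenaId, player + " wins in " + arena));
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("arena-1", "Dojo");                      // made-up sample data
        List<String> forwarded = new ArrayList<>();
        LookupAndForwardSketch processor =
                new LookupAndForwardSketch(store, (key, value) -> forwarded.add(key + ": " + value));

        processor.process("arena-1", "Ryu");               // known arena: forwarded
        processor.process("arena-9", "Ken");               // unknown arena: dropped

        System.out.println(forwarded);                     // prints [arena-1: Ryu wins in Dojo]
    }
}
```

Injecting the store and the downstream callback through the constructor is only a convenience of this sketch; in Kafka Streams both are obtained from the processor context in init.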
Finally, the best part of using the Processor API appears when we combine it with the high-level Streams DSL. The .transform() method allows us to use a custom processor within a KStream.
StreamsBuilder builder = new StreamsBuilder();

KStream<String, Round> rounds = builder
    .stream("ROUNDS", Consumed.with(Serdes.String(), roundSerde, new EventTimeExtractor(), LATEST));

rounds
    .filter((arenaId, round) -> round.getGame() == StreetFighter)
    .filter((arenaId, round) -> round.getWinner().getCombo() >= 5)
    .filter((arenaId, round) -> round.getWinner().getLife() >= 75)
    .through("ONE-PARTITION-WINNER-TOPIC")
    .transform(ProcessToken::new, "TOKEN-STORE")
    .to("TOKEN-PROVIDED");
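The pipeline above chains stateless steps (the filters) with one stateful step backed by a store. The plain-Java sketch below mimics that shape without Kafka. The two thresholds come from the filters above, but this README does not show what ProcessToken does, so the stateful step here (numbering the records that pass the filters) is purely an assumption made for illustration. FilterThenTransformSketch, Winner and the sample data are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not code from this project and not the Kafka Streams API.
class FilterThenTransformSketch {

    // Minimal stand-in for the winner of a round.
    static final class Winner {
        final String name;
        final int combo;
        final int life;

        Winner(String name, int combo, int life) {
            this.name = name;
            this.combo = combo;
            this.life = life;
        }
    }

    // Stateless step: same thresholds as the filters above (combo >= 5, life >= 75).
    static boolean passesFilters(int combo, int life) {
        return combo >= 5 && life >= 75;
    }

    // Stateful step (assumed behavior): the store remembers how many records were
    // already seen, and each new record is tagged with the next sequence number.
    static String transform(Map<String, Integer> store, String winnerName) {
        int next = store.merge("sequence", 1, Integer::sum);
        return winnerName + "#" + next;
    }

    // Runs the stateless step, then the stateful step, over all winners in order.
    static List<String> run(List<Winner> winners) {
        Map<String, Integer> store = new HashMap<>();   // stands in for a state store
        List<String> output = new ArrayList<>();
        for (Winner winner : winners) {
            if (passesFilters(winner.combo, winner.life)) {
                output.add(transform(store, winner.name));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<Winner> winners = Arrays.asList(
                new Winner("Ryu", 6, 80),    // passes both filters
                new Winner("Ken", 3, 90),    // combo too low, filtered out
                new Winner("Ryu", 7, 75));   // passes both filters
        System.out.println(run(winners));    // prints [Ryu#1, Ryu#2]
    }
}
```

The filters need no memory and can be expressed directly in the DSL, while the numbering step depends on what was seen before, which is the kind of logic that calls for a store and the Processor API.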