GitHub - rhauch/debezium-proto: Fast, scalable, durable, and distributed data store for mobile apps and web apps

This is a prototype for a demonstration system that uses Kafka, stream processing, and multiple distributed services to store data for theoretical mobile applications. A summary of the initial version that uses Apache Samza for stream processing is here, while the code for that version is (available as an archive or navigable code).

The current state of the codebase is incomplete and replaces Samza (and the requirement for YARN) with the proposed and much lighter-weight Kakfa Streams, which has not yet been released and which has changed since work on this prototype has stopped (see https://github.com/rhauch/debezium).

Debezium

This prototype demonstrates an easy-to-use, fast, and durable distributed storage service for mobile application data. With Debezium, apps have to do less work coordinating changes to shared state, and that saves mobile app developers time and money and keeps mobile apps less complicated. Debezium uses an event streaming approach to record each incoming request in a durable, partitioned, and replicated log, and Debezium's share-nothing services asynchronously process these requests and produce output to other logs. This architecture ensures that no data is lost: if any of the services should fail, when they are restarted they will simply continue where the previous service left off.

Technology

Debezium uses Apache Kafka to read and write the durable, partitioned, and replicated logs. Kafka is very powerful and has features that are essential for Debezium:

All messages are persisted on disk in a very fast and efficient way.
Message topics are partitioned and replicated across multiple machines, with automatic failover. Kafka keeps track of how many replicas are in sync, and if that number falls below some threshold you can specify whether the broker becomes unavailable or to continue producing messages and risk losing data. The only thing shared amongst replicas is the network connection.
Messages written with the same key will always be sent to the same partition, and all messages within a single partition have total ordering.
Consumers read topic partitions and control where in the log they are. If needed, they can even re-read old messages.Consumers can use either queuing or pub-sub.

With this architecture, Kafka can handle tremendous amounts of information and be configured to never lose any data. And because consumers are in control of where in the log they read, more options are available to Debezium than with traditional AMQP or JMS messaging systems.

Future plans

No more work on this prototype will be done. Instead, it has been superseded by another prototype that is focusing not on data storage for mobile apps but Change Data Capture. The architecture is quite similar, and the success of this prototype's use of Kafka shows the value of using Kafka as a replicated, partitioned, and append-only transaction log while using stream processing services to easily consume the totally ordered events within the logs.

Name		Name	Last commit message	Last commit date
Latest commit History 68 Commits
debezium-server		debezium-server
debezium		debezium
distribution		distribution
old		old
support		support
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
pom.xml		pom.xml
settings.xml		settings.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Debezium

Technology

Future plans

About

Releases

Packages

Contributors 2

Languages

License

rhauch/debezium-proto

Folders and files

Latest commit

History

Repository files navigation

Debezium

Technology

Future plans

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages