Added documentation on how the provider works and use cases (#121)
masesdevelopers authored Oct 20, 2023
1 parent aac6e22 commit b9cfcbd
Showing 5 changed files with 171 additions and 6 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -48,10 +48,11 @@ This project adheres to the Contributor [Covenant code of conduct](CODE_OF_CONDU
## Summary

* [Getting started](src/documentation/articles/gettingstarted.md)
* [How it works](src/documentation/articles/howitworks.md)
* [Usage](src/documentation/articles/usage.md)
* [Use cases](src/documentation/articles/usecases.md)
* [Serialization](src/documentation/articles/serialization.md)
* [Templates usage](src/documentation/articles/usageTemplates.md)
* [Serialization](src/documentation/articles/serialization.md)
* [External application](src/documentation/articles/externalapplication.md)
* [Roadmap](src/documentation/articles/roadmap.md)
* [Current state](src/documentation/articles/currentstate.md)
93 changes: 93 additions & 0 deletions src/documentation/articles/howitworks.md
@@ -0,0 +1,93 @@
# KEFCore: how it works

The [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) can be used under specific operating conditions.

Before looking at them, it is important to start with a simple description of how the provider works.

## Basic concepts

The image below, taken from Wikipedia, illustrates the basic concepts:

![Overview of Apache Kafka](https://upload.wikimedia.org/wikipedia/commons/6/64/Overview_of_Apache_Kafka.svg "Kafka basic concepts")

Simplifying, there are three active elements:
- **Topics**: the storage of the records (the data); they are hosted in the Apache Kafka cluster and can be partitioned
- **Producers**: entities that produce records to be stored in one or more topics
- **Consumers**: entities that receive records from the topics

When a producer sends a record to the Apache Kafka cluster, the record is delivered to the consumers subscribed to the topic the producer is producing on: this is the classic pub-sub pattern.
The Apache Kafka cluster also stores the record within the topic the producer has produced on. This feature guarantees that:
- an application consuming from the Apache Kafka cluster can listen only for the latest changes, or seek to a specific position in the past and receive data from that point onward
- the standard way to consume from the Apache Kafka cluster is to start from the end (latest available record) or from the beginning (first available record)
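
The difference between these starting positions can be illustrated with a minimal sketch. It does not use any Apache Kafka client: the topic is modeled as a plain in-memory list, and `TopicLogSketch` and `ReadFrom` are hypothetical names introduced only for this example.

```cs
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a single topic partition:
// an append-only list where the list index plays the role of the offset.
public static class TopicLogSketch
{
    // Returns the records a consumer would receive when starting from the given offset.
    public static List<string> ReadFrom(IReadOnlyList<string> log, int startOffset)
    {
        var received = new List<string>();
        for (int offset = startOffset; offset < log.Count; offset++)
        {
            received.Add(log[offset]);
        }
        return received;
    }

    public static void Main()
    {
        var log = new List<string> { "r0", "r1", "r2" };

        // Start from the beginning: every stored record is replayed.
        Console.WriteLine(string.Join(",", ReadFrom(log, 0)));   // r0,r1,r2

        // Start from the end: only records produced afterwards are seen.
        int end = log.Count;
        log.Add("r3");
        Console.WriteLine(string.Join(",", ReadFrom(log, end))); // r3

        // Seek to a specific position in the past.
        Console.WriteLine(string.Join(",", ReadFrom(log, 1)));   // r1,r2,r3
    }
}
```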

## How [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) works

An application based on the [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) is both a producer and a consumer at the same time:
- when an entity is created, updated or deleted (e.g. by calling [SaveChanges](https://learn.microsoft.com/en-us/ef/core/saving/basic)) the provider invokes the appropriate producer to store a new record in the corresponding topic of the Apache Kafka cluster
- the subscribed consumer is then informed about the new record and stores it back in the application: this may not seem useful yet, but its purpose becomes clear later

The Apache Kafka cluster becomes:
1. a central router for data changes in [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) based applications
2. a reliable storage because, when the application restarts, the data stored in the topics is read back by the consumers, so the state is aligned to the latest available

Apache Kafka comes with the [topic compaction](https://kafka.apache.org/documentation/#compaction) feature, which optimizes point 2.
The [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) is interested in storing only the latest state of an entity, not the history of its changes.
Using [topic compaction](https://kafka.apache.org/documentation/#compaction), the combination of producer, consumer and Apache Kafka cluster can apply the CRUD operations on data:
- Create: a producer stores a new record with a unique key
- Read: a consumer retrieves records from the topic
- Update: a producer stores a new record with a previously stored unique key; compaction eventually discards the older records with that key
- Delete: a producer stores a new record with a previously stored unique key and a null value; compaction eventually deletes all records with that unique key
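
The effect of these four operations on the data a consumer ends up with can be sketched without a broker. The class below is a hypothetical in-memory model (`CompactedTopicSketch` is not part of the provider or of KNet) that keeps only the latest value per key. A real Apache Kafka cluster compacts asynchronously, so older records may remain visible for a while.

```cs
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of a compacted topic: only the latest value per key survives.
public sealed class CompactedTopicSketch
{
    private readonly Dictionary<string, string> _latest = new Dictionary<string, string>();

    // Create and Update: producing a record replaces any older record with the same key.
    // Delete: producing a null value (a tombstone) removes the key.
    public void Produce(string key, string value)
    {
        if (value == null) _latest.Remove(key);
        else _latest[key] = value;
    }

    // Read: a consumer sees the latest record stored for the key, if any.
    public bool TryRead(string key, out string value) => _latest.TryGetValue(key, out value);

    public int Count => _latest.Count;
}

public static class CompactedTopicDemo
{
    public static void Main()
    {
        var topic = new CompactedTopicSketch();
        topic.Produce("blog-1", "v1");            // Create
        topic.Produce("blog-1", "v2");            // Update
        topic.TryRead("blog-1", out var current); // Read
        Console.WriteLine(current);               // v2
        topic.Produce("blog-1", null);            // Delete
        Console.WriteLine(topic.Count);           // 0
    }
}
```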

All CRUD operations are supported, behind the scenes, by [`KNetCompactedReplicator`](https://github.com/masesgroup/KNet/blob/master/src/net/KNet/Specific/Replicator/KNetCompactedReplicator.cs) and/or [`KNetProducer`](https://github.com/masesgroup/KNet/blob/master/src/net/KNet/Specific/Producer/KNetProducer.cs)/[Apache Kafka Streams](https://kafka.apache.org/documentation/streams/).

### Data storage

Apache Kafka stores information as records, so entities must be converted into something Apache Kafka can use.
The conversion is done by serializers that convert the entities (data in the model) into Apache Kafka records and vice versa: see the [serialization chapter](serialization.md) for more info.

## [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) compared to other providers

The previous chapter described how the [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) reproduces the CRUD operations.
Starting from the model defined in code, the data is stored in the topics, and each topic can be seen as a database table filled with the same data.
From the point of view of an application, using the [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) is similar to using the InMemory provider.

### A note on [migrations](https://learn.microsoft.com/en-us/ef/core/managing-schemas/migrations)

The current version of [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) does not support [migrations](https://learn.microsoft.com/en-us/ef/core/managing-schemas/migrations).

## [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) features not available in other providers

Here is a list of features that the [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) offers its users and that are useful in some use cases.

### Distributed cache

The previous chapter stated that consumers align the application data to the latest information in the topics.
The alignment is managed by [`KNetCompactedReplicator`](https://github.com/masesgroup/KNet/blob/master/src/net/KNet/Specific/Replicator/KNetCompactedReplicator.cs) and/or [Apache Kafka Streams](https://kafka.apache.org/documentation/streams/); everything is driven by the Apache Kafka back-end.
Two or more applications that share the same model and configuration always align to the latest state of the topics involved.
This implies that there is, virtually, a distributed cache between the applications and the Apache Kafka back-end:
- Apache Kafka physically stores the cache (shared state) within the topics and routes changes to the subscribed applications
- Applications use the latest cache version (local state) received from the Apache Kafka back-end

If an application restarts, it is able to retrieve the latest data (latest cache) and align to the shared state.
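
The alignment can be pictured as replaying the shared records into a local dictionary. The sketch below is only an illustration under that assumption: `DistributedCacheSketch` and `Align` are hypothetical names, and the shared state is an in-memory list rather than a topic.

```cs
using System;
using System.Collections.Generic;

public static class DistributedCacheSketch
{
    // Rebuilds a local state by replaying every (key, value) record of the shared log in order.
    public static Dictionary<string, string> Align(IEnumerable<KeyValuePair<string, string>> sharedLog)
    {
        var local = new Dictionary<string, string>();
        foreach (var record in sharedLog)
        {
            if (record.Value == null) local.Remove(record.Key); // a null value deletes the key
            else local[record.Key] = record.Value;              // the latest value wins
        }
        return local;
    }

    public static void Main()
    {
        // Stand-in for the shared state held in the topics.
        var sharedLog = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("item-1", "a"),
            new KeyValuePair<string, string>("item-2", "b"),
            new KeyValuePair<string, string>("item-1", "c"),
        };

        var appA = Align(sharedLog);
        var appB = Align(sharedLog); // e.g. an application that has just restarted

        Console.WriteLine(appA["item-1"]);                   // c
        Console.WriteLine(appA["item-1"] == appB["item-1"]); // True
    }
}
```

Both local states end up identical because they are derived from the same ordered sequence of records.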

### Events

Generally, an application based on [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) executes queries against the back-end to store, or retrieve, information on demand.
The alignment (a record being consumed) can be considered a change event: any change in the back-end produces an event that can be used in different ways.
These change events are used by [`KNetCompactedReplicator`](https://github.com/masesgroup/KNet/blob/master/src/net/KNet/Specific/Replicator/KNetCompactedReplicator.cs) and/or [Apache Kafka Streams](https://kafka.apache.org/documentation/streams/) to align the local state.
Moreover, the [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) can inform the registered application about these events, using callbacks and at zero cost.
The application can then use the reported events to execute some actions:
- execute a query
- write something to disk
- execute a REST call
- and so on
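
The callback idea can be sketched with a plain C# event. This is not the provider's API: `LocalStateSketch`, `Changed` and `OnRecordConsumed` are hypothetical names, chosen only to show how a consumed record can both update the local state and notify the application.

```cs
using System;
using System.Collections.Generic;

// Hypothetical local state that raises a callback whenever a consumed record changes it.
public sealed class LocalStateSketch
{
    private readonly Dictionary<string, string> _state = new Dictionary<string, string>();

    // Hypothetical event: the arguments are the key and the new value (null for a delete).
    public event Action<string, string> Changed;

    // Called for each consumed record: aligns the local state, then notifies subscribers.
    public void OnRecordConsumed(string key, string value)
    {
        if (value == null) _state.Remove(key);
        else _state[key] = value;
        Changed?.Invoke(key, value);
    }
}

public static class ChangeEventDemo
{
    public static void Main()
    {
        var seen = new List<string>();
        var state = new LocalStateSketch();

        // The application registers a callback and reacts to each change event.
        state.Changed += (key, value) => seen.Add(key + "=" + (value ?? "<deleted>"));

        state.OnRecordConsumed("item-1", "a");
        state.OnRecordConsumed("item-1", null);

        Console.WriteLine(string.Join(";", seen)); // item-1=a;item-1=<deleted>
    }
}
```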

### Applications not based on [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/)

So far the discussion has covered applications based on [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/); however, this provider can also be used to feed applications not based on [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/).
The [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) comes with ready-made helper classes that subscribe to any topic of the Apache Kafka cluster to retrieve the data stored by an application based on [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/).
Any application can use this feature to:
- read the latest data stored in the topics by the application based on [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/)
- attach to the topics used by the application based on [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) and receive change events whenever something is produced

When a record is received, the ready-made helper classes deserialize it and return the filled entity.
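
As a rough illustration of the "deserialize and return the filled entity" step, the sketch below turns a record value into an entity using `System.Text.Json`. It does not use the ready-made helper classes. It assumes, purely for the example, that the record value is a JSON document whose property names match the entity; the actual format is described in the [serialization chapter](serialization.md). `ItemSketch` and `ExternalReaderSketch` are hypothetical names.

```cs
using System;
using System.Text.Json;

// Hypothetical entity type, mirroring a class of the Entity Framework Core model.
public class ItemSketch
{
    public int ItemId { get; set; }
    public string Data { get; set; }
}

public static class ExternalReaderSketch
{
    // Assumption: the record value is a JSON document whose property names match the entity.
    public static ItemSketch ToEntity(string recordValue)
    {
        return JsonSerializer.Deserialize<ItemSketch>(recordValue);
    }

    public static void Main()
    {
        var entity = ToEntity("{\"ItemId\":1,\"Data\":\"hello\"}");
        Console.WriteLine(entity.ItemId + ": " + entity.Data); // 1: hello
    }
}
```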
8 changes: 5 additions & 3 deletions src/documentation/articles/toc.yml
@@ -2,10 +2,14 @@
  href: intro.md
- name: Getting started
  href: gettingstarted.md
- name: How it works
  href: howitworks.md
- name: Usage
  href: usage.md
- name: Use cases
  href: usecases.md
- name: Template usage
  href: usageTemplates.md
- name: Serialization
  href: serialization.md
- name: External application
@@ -15,6 +19,4 @@
- name: Current state
  href: currentstate.md
- name: KafkaDbContext
  href: kafkadbcontext.md
- name: Template usage
  href: usageTemplates.md
  href: kafkadbcontext.md
70 changes: 69 additions & 1 deletion src/documentation/articles/usecases.md
@@ -3,4 +3,72 @@
[Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) can be used in some operative conditions.
Here is a possible, non-exhaustive list of use cases.

TBD
Before reading the following chapters, it is important to understand [how it works](howitworks.md).

## [Apache Kafka](https://kafka.apache.org/) as Database

The first use case corresponds to the standard usage of [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/), the same as when it is used with database providers.
[Getting started](gettingstarted.md) proposes a simple example that follows the online documentation.
In the example, the data within the model is stored in multiple Apache Kafka topics; each topic corresponds to a `DbSet` declared in the `DbContext`.

Constraints are managed using `OnModelCreating` of the `DbContext`.

## [Apache Kafka](https://kafka.apache.org/) as distributed cache

By changing the way a model is conceived, it is possible to define a set of classes that act as storage for the data to be used as a cache.
It is possible to build a new model like:
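```cs
using Microsoft.EntityFrameworkCore; // provides DbSet<T>

// Context shared by every application that uses the cache
public class CachingContext : KafkaDbContext
{
    public DbSet<Item> Items { get; set; }
}

// Entity stored in the cache; by Entity Framework Core convention ItemId is the key
public class Item
{
    public int ItemId { get; set; }
    public string Data { get; set; }
}
```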

When this model is shared between multiple applications and each application allocates its own `CachingContext`, the cache is shared and the same data is available to all of them.

## [Apache Kafka](https://kafka.apache.org/) as a triggered distributed cache

Continuing from the previous use case, the events reported by the [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) make it possible to write a reactive application.
When a change event is triggered, the application can react to it and take an action.

### SignalR

The triggered distributed cache can be used side-by-side with [SignalR](https://learn.microsoft.com/it-it/aspnet/signalr/overview/getting-started/introduction-to-signalr): an application that combines the [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) provider for [Apache Kafka](https://kafka.apache.org/) with [SignalR](https://learn.microsoft.com/it-it/aspnet/signalr/overview/getting-started/introduction-to-signalr), and subscribes to the change events, can feed the applications connected to [SignalR](https://learn.microsoft.com/it-it/aspnet/signalr/overview/getting-started/introduction-to-signalr).

### Redis

The triggered distributed cache can be seen as a [Redis](https://redis.io/) backend.

## Data processing outside [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/) application

The schemas used to write the information in the topics are available, or can be defined by the user, so an external application can use the data in several ways:
- using the feature that extracts the entities stored in the topics outside the application based on [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/)
- using Apache Kafka features such as Apache Kafka Streams or Apache Kafka Connect

### External application

An application, not based on [Entity Framework Core](https://learn.microsoft.com/it-it/ef/core/), can subscribe to the topics to:
- store all change events to another medium
- analyze the data or the changes
- and so on

### Apache Kafka Streams

Apache Kafka comes with the powerful Streams feature. An application based on Streams can analyze streams of data to extract information or convert the data into something else.
It is possible to build an application, based on Apache Kafka Streams, that listens for change events and produces something else, or just stores them in another topic containing all events and not only the latest (e.g. much like the transaction log of SQL Server does).

### Apache Kafka Connect

Apache Kafka comes with another powerful feature called Connect: it provides ready-made connectors that connect Apache Kafka with other systems (databases, storage, etc.).
There are sink and source connectors, and each connector has its own specifics:
- Database: the data in the topics can be converted and stored in a database
- File: the data in the topics can be converted and stored in one, or more, files
- Other: there are many ready-made connectors, or a connector can be built using the [Connect SDK](https://github.com/masesgroup/KNet/blob/master/src/documentation/articles/connectSDK.md)

**NOTE**: While an Apache Kafka Streams application runs on its own, Apache Kafka Connect can allocate the connectors using its distributed mode, which balances the load and automatically restarts operations if something goes wrong.
3 changes: 2 additions & 1 deletion src/documentation/index.md
@@ -42,10 +42,11 @@ This project adheres to the Contributor [Covenant code of conduct](CODE_OF_CONDU
## Summary

* [Getting started](articles/gettingstarted.md)
* [How it works](articles/howitworks.md)
* [Usage](articles/usage.md)
* [Use cases](articles/usecases.md)
* [Serialization](articles/serialization.md)
* [Templates usage](articles/usageTemplates.md)
* [Serialization](articles/serialization.md)
* [External application](articles/externalapplication.md)
* [Roadmap](articles/roadmap.md)
* [Current state](articles/currentstate.md)
