[DISCUSSION] : Consideration of Using Kafka for Streaming Data to Clients #4

Closed
zakhaev26 opened this issue Dec 18, 2023 · 17 comments
Labels: discussion, help wanted, P-high, question, server

Comments

@zakhaev26
Member

zakhaev26 commented Dec 18, 2023

The goal of the GCSB project is to develop a robust system of multiple independent, decoupled APIs for sports. Currently, Server-Sent Events (SSE) have been identified as a suitable choice for real-time communication from server to client due to their lightweight nature and ease of setup, but there are certain concerns that need to be addressed first.
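For reference, a minimal SSE endpoint needs little more than the standard library; the sketch below is a rough Go example (the route and payload are illustrative, not our actual API):

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"
)

// scoreEvents streams updates to one client over SSE.
// The route and payload here are illustrative placeholders.
func scoreEvents(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	w.Header().Set("Connection", "keep-alive")

	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}

	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()

	for {
		select {
		case <-r.Context().Done(): // client disconnected
			return
		case t := <-ticker.C:
			// Each SSE message is "data: <payload>" followed by a blank line.
			fmt.Fprintf(w, "data: {\"ts\": %q}\n\n", t.Format(time.RFC3339))
			flusher.Flush()
		}
	}
}

func main() {
	http.HandleFunc("/events/scores", scoreEvents)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```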

Key Concerns:

  1. API Design Uniformity:
    How should we design the APIs to ensure uniformity across the project? Should we opt for a single SSE stream or an individual SSE stream for each API?
    We are aiming for a uniform approach that enhances consistency and eases maintenance.
  • Single vs. Individual SSE: trade-offs between having a single SSE stream for all APIs or an individual stream per API.
  • Design Principles: we need to brainstorm the design principles to follow for API endpoints, naming conventions, and response formats.
  2. Need for Kafka or a Similar Queue/Pub-Sub System:

    Question: Considering a worst case of 1,000 concurrent users, do we really need a queuing or pub-sub architecture like Kafka/RabbitMQ? What are the pros and cons? Can SSE alone handle the expected load, or is a more scalable solution necessary?

Please share your thoughts, concerns, and suggestions regarding the API design uniformity and the need for a queuing/pub-sub architecture in this issue thread.

Consider this a high-priority issue.

@zakhaev26 added the server, P-high, discussion, help wanted, and question labels on Dec 18, 2023
@majorbruteforce
Member

majorbruteforce commented Dec 18, 2023

Kafka and other pub-sub services facilitate high-throughput reads and writes between systems. They don't inherently help the client-facing server deal with a large number of requests (they are probably not made for that purpose). Keeping the project dynamics in mind, we in fact don't have a large throughput to deal with; rather, we need to serve data to a large volume of clients concurrently, reliably, and in real time. In my opinion, we should look into load balancing the servers and scaling them when required, along with using Redis to cache the data. We can build a system and test it with something like Apache JMeter. We should also test first to determine the degree of resilience our system requires, so we don't overengineer it.

@punitkr03
Collaborator

@majorbruteforce I was having the same thought. We can work with caching, as we do not need high data throughput. We will keep the cache in sync with the database at set intervals, thus reducing complexity.
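A rough sketch of what that interval sync could look like with go-redis (fetchScoresFromDB, the key name, and the interval are hypothetical placeholders):

```go
package main

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

// fetchScoresFromDB is a hypothetical stand-in for the real database query.
func fetchScoresFromDB(ctx context.Context) (string, error) {
	return `{"matchId": 1, "score": "2-1"}`, nil
}

// syncCache refreshes the cached scoreboard at a fixed interval so that
// reads can be served from Redis instead of hitting the database.
func syncCache(ctx context.Context, rdb *redis.Client, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			scores, err := fetchScoresFromDB(ctx)
			if err != nil {
				continue // keep serving the stale value on a failed refresh
			}
			rdb.Set(ctx, "scores:latest", scores, 0) // 0 = no expiry
		}
	}
}

func main() {
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	go syncCache(context.Background(), rdb, 5*time.Second)
	select {} // block; the real server would serve HTTP here
}
```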

@zakhaev26
Member Author

zakhaev26 commented Dec 19, 2023

We have two choices as of now that cater to this need:
M1. Using database change streams to track changes and emit them via the SSE server (see the sketch at the end of this comment).
M2. Having a centralized Kafka cluster with multiple Kafka servers: producers (admins) publish messages to the partitions, and consumers pick them up, process them, and emit them via SSE.

Pros of M1: Simple to set up.
Cons of M1: Scalability.

Pros of M2: Reliable, fault-tolerant, scalable; can help in building a unified system.
Cons of M2: Hard learning curve, maintenance overheads.
E.g., a two-sport pub-sub system: (diagram attached)
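Here is a minimal sketch of M1 using the official mongo-driver; the database and collection names are illustrative, and the real server would fan each event out to SSE clients instead of logging it:

```go
package main

import (
	"context"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	// Watch a scores collection for changes (names are illustrative).
	coll := client.Database("gcsb").Collection("scores")
	stream, err := coll.Watch(ctx, mongo.Pipeline{})
	if err != nil {
		log.Fatal(err) // note: change streams require a replica set
	}
	defer stream.Close(ctx)

	for stream.Next(ctx) {
		var event bson.M
		if err := stream.Decode(&event); err != nil {
			log.Println("decode:", err)
			continue
		}
		// In the real server, this event would be pushed to SSE clients.
		log.Printf("change event: %v", event)
	}
}
```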

@zakhaev26
Member Author

zakhaev26 commented Dec 19, 2023

I believe ~1k connections can be managed by both methods.
In my opinion, we should prioritize building the system with MongoDB change streams for now, as scalability is not really our need.
We might not want to overengineer and complicate things, although if there is a requirement, please say so.
@majorbruteforce @punitkr03 Your thoughts?

@punitkr03
Collaborator

Seems complicated. Let me do some research.

@punitkr03
Collaborator

@majorbruteforce @zakhaev26 I did some digging, and in my view using Kafka will ensure stability in the long run if the project scales up, so it won't be redundant. Also, we have plenty of time to implement it. I am all in on the Kafka implementation.

@zakhaev26
Member Author

zakhaev26 commented Dec 20, 2023

I am also interested in using Kafka..
@majorbruteforce @Brijendra-Singh2003 ?

@zakhaev26
Member Author

zakhaev26 commented Dec 20, 2023

Ran some benchmarks to test the reliability of change streams vs. Kafka.
These runs were performed on:

  • 4-core i5-6200U CPU @ 2.30GHz, Ubuntu 22.04.3 LTS
  • Go v1.21.5
  • Kafka 3.6.1
  • Zookeeper 3.9.1

Test Scenario:

  • Concurrent writes to a MongoDB collection monitored by change streams.
  • Concurrent writes to MongoDB plus concurrent publishing of messages to Apache Kafka.
    NOTE: Responses (the inserted output) were serialized as JSON and sent to the client. I used Postman Runner to perform 100 iterations with a 0ms delay in both cases.

Outcome:

Functional Benchmarks:

Apache Kafka avg response time: ~364ms, with all iterations succeeding. (screenshot attached)

Mongo change streams avg response time:
Trial 1: ~741ms [freezes and fails after 8 iterations] (screenshot attached)
Trial 2: ~693ms [freezes and fails after 16 iterations] (screenshot attached)
Performance Benchmarks:

10 VUs for 1 min fixed load

  • Kafka: (screenshot attached)
  • Change streams: (screenshot attached)

100 VUs for 1 min fixed load

  • Kafka:
    (My laptop shuts down every time I run the test. XD)
    But a screenshot towards the end of the test shows:
    Avg time: 1755ms (screenshot attached)
  • Mongo change streams:
    Avg time: 2946ms (screenshot attached)

Does this mean we shouldn't choose change streams over Kafka? Nope.
These are admin updates; I don't think there would be 100 admins, or even 2 admins, uploading scores in a single session. In that case, both are doable.

But after these tests, Kafka looks good.

@majorbruteforce @punitkr03 @Brijendra-Singh2003

I definitely didn't feel like primeagen after performing these tests :p

Source : https://github.com/zakhaev26/microservices-go

@zakhaev26
Member Author

P.S.: I did try out the most popular Kafka library for JS, but it was slower to interact with Kafka IMO, whereas the Confluent Kafka library for Golang felt way faster, even on a single thread.

So even if we are planning to use Kafka, we need to make sure it works fine and is manageable with Node so that devs can work with JS as well.
Performance metrics alone don't tell the full story; practical integration and developer experience matter too.

@majorbruteforce
Member

majorbruteforce commented Dec 21, 2023

> I am also interested in using Kafka.. @majorbruteforce @Brijendra-Singh2003 ?

I am looking to create a load balanced system that caches the data using Redis. I will try to test how many SSE connections a server with standard specifications can handle.
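A crude way to run that test is to open many plain HTTP connections to the SSE endpoint and count how many stay open; a sketch, with the URL and connection count as placeholders:

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"sync/atomic"
	"time"
)

func main() {
	const n = 1000                               // target number of concurrent SSE clients (placeholder)
	url := "http://localhost:8080/events/scores" // hypothetical SSE endpoint

	var open int64
	for i := 0; i < n; i++ {
		go func() {
			resp, err := http.Get(url)
			if err != nil {
				return
			}
			defer resp.Body.Close()
			atomic.AddInt64(&open, 1)
			defer atomic.AddInt64(&open, -1)

			// Hold the connection open, consuming events as they arrive.
			scanner := bufio.NewScanner(resp.Body)
			for scanner.Scan() {
			}
		}()
	}

	// Let connections establish, then report how many are still open.
	time.Sleep(10 * time.Second)
	fmt.Printf("open SSE connections: %d/%d\n", atomic.LoadInt64(&open), n)
}
```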

@zakhaev26
Member Author

What's the progress Jesse? @majorbruteforce 🕺

@majorbruteforce
Member

While building a two-layer system with a cache layer, I realized it makes no sense to use a cache for an application that has to update data constantly. Revalidating the cache so frequently is no better than broadcasting changes directly from change streams. I am trying to test a few more ways, like polling, to see how they compare. I will run some benchmarks and start working on building the main APIs soon.
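One way to make that direct broadcast concrete is a small fan-out hub: the change-stream (or polling) loop publishes into it, and each SSE handler subscribes on connect. A minimal sketch (names and buffer size are arbitrary):

```go
package main

import (
	"fmt"
	"sync"
)

// Hub fans one stream of update messages out to many SSE subscribers.
type Hub struct {
	mu   sync.Mutex
	subs map[chan []byte]struct{}
}

func NewHub() *Hub {
	return &Hub{subs: make(map[chan []byte]struct{})}
}

func (h *Hub) Subscribe() chan []byte {
	ch := make(chan []byte, 16) // buffered so one slow client doesn't block
	h.mu.Lock()
	h.subs[ch] = struct{}{}
	h.mu.Unlock()
	return ch
}

func (h *Hub) Unsubscribe(ch chan []byte) {
	h.mu.Lock()
	delete(h.subs, ch)
	h.mu.Unlock()
	close(ch)
}

func (h *Hub) Broadcast(msg []byte) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for ch := range h.subs {
		select {
		case ch <- msg:
		default: // drop for clients that can't keep up
		}
	}
}

func main() {
	h := NewHub()
	ch := h.Subscribe()
	go h.Broadcast([]byte("score update"))
	fmt.Println(string(<-ch))
	h.Unsubscribe(ch)
}
```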

@majorbruteforce
Member

Also, @zakhaev26 try running the benchmarks for read operations once. I will do the same. The system is going to have bulk reads rather than writes.

@zakhaev26
Member Author

> The system is going to have bulk reads rather than writes.

Genuine

@punitkr03
Collaborator

A basic implementation of the chess API architecture: (diagram attached)

@zakhaev26
Member Author

zakhaev26 commented Dec 29, 2023

I used the Sarama library instead of the Confluent one for interacting with Kafka. I felt it is a more reliable way to use a producer + subscriber, and it is extremely fast.
Look into it if writing in Go: Sarama
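A minimal Sarama sync-producer sketch for anyone picking it up (broker address, topic, and payload are illustrative):

```go
package main

import (
	"log"

	"github.com/IBM/sarama"
)

func main() {
	config := sarama.NewConfig()
	config.Producer.Return.Successes = true // required by SyncProducer

	producer, err := sarama.NewSyncProducer([]string{"localhost:9092"}, config)
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()

	// Topic name and payload are placeholders.
	msg := &sarama.ProducerMessage{
		Topic: "score-updates",
		Value: sarama.StringEncoder(`{"matchId": 1, "score": "2-1"}`),
	}
	partition, offset, err := producer.SendMessage(msg)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("stored at partition %d, offset %d", partition, offset)
}
```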

@majorbruteforce
Member

Kafka has been tried and is the best option to proceed with for event streaming, as tested by @zakhaev26. Discussions regarding its implementation will continue on #38 from here on.
