This repository has been archived by the owner on Sep 1, 2024. It is now read-only.

Examine real-world scaling scenarios #67

Open
1 of 6 tasks
litalmason opened this issue Dec 24, 2023 · 1 comment
Labels: feature (New feature)
Milestone: 2.0.0

Comments

@litalmason (Collaborator)

litalmason commented Dec 24, 2023

Description

We want to observe how different scaling setups behave at different concurrency levels.
This will be done by simulating many clients and observing the server pods scale up.
We would like to examine the impact of PQC on such common use cases, measuring memory, CPU, throughput, and more.

Acceptance Criteria

Real-world scaling scenarios

We need to test at least the following scenarios; a parameterized sketch of this matrix is included after the list.

Low traffic applications

  1. Light Traffic E-commerce
    • Number of Requests: 100
    • Request Size: Small (e.g., 1KB)
    • Concurrency: 20
    • Scenario: Simulate an e-commerce application during off-peak hours.

Medium traffic applications

  1. IoT Device Control
    • Number of Requests: 300
    • Request Size: Tiny (e.g., 100 bytes)
    • Algorithms: Quantum-Safe, Hybrid, Classic
    • Concurrency: 30
    • Scenario: Evaluate the impact of algorithms on real-time control and monitoring of IoT devices.

  2. Healthcare Records Access
    • Number of Requests: 500
    • Request Size: Medium (e.g., 1MB)
    • Concurrency: 50
    • Scenario: Simulate a healthcare application for accessing patient records.

  3. Social Media Surge
    • Number of Requests: 1000
    • Request Size: Medium (e.g., 1MB)
    • Concurrency: 100
    • Scenario: Emulate a social media platform during a viral event or trending topic.

High traffic applications

  1. Online Banking Transactions
    • Number of Requests: 2000
    • Request Size: Medium (e.g., 1MB)
    • Concurrency: 200
    • Scenario: Assess the impact of algorithms on the security and speed of financial transactions.

  2. Ride-Sharing Peak Hours
    • Number of Requests: 3000
    • Request Size: Tiny (e.g., 100 bytes)
    • Algorithms: Quantum-Safe, Hybrid, Classic
    • Concurrency: 300
    • Scenario: Simulate a ride-sharing app during rush hours in a busy city.

Very high traffic applications

  1. Online Retail Peak Sale
    • Number of Requests: 5000
    • Request Size: Large (e.g., 10MB)
    • Concurrency: 500
    • Scenario: Simulate an online retail store during a peak shopping season or sale event.

  2. Video Streaming Service
    • Number of Requests: 10,000
    • Request Size: Large (e.g., 10MB)
    • Algorithms: Quantum-Safe, Hybrid, Classic
    • Concurrency: 1000
    • Scenario: Evaluate how well your application handles a surge in video streaming requests during a major live event.

  3. Content Delivery Network (CDN)
    • Number of Requests: 20,000
    • Request Size: Very Large (e.g., 100MB+)
    • Algorithms: Quantum-Safe, Hybrid, Classic
    • Concurrency: 2000
    • Scenario: Measure how algorithms affect the speed and efficiency of delivering very large media files during a global event.
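
The scenario matrix above can be expressed as plain data so a test harness can iterate over it. Below is a minimal sketch in Python; the `Scenario` class, its field names, and the default algorithm set applied to scenarios that do not list algorithms explicitly are assumptions, not existing qujata code.

```python
# Sketch of the scaling-scenario matrix as plain data. Names and fields are
# illustrative; the real harness may model scenarios differently.
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Scenario:
    name: str
    requests: int        # total number of requests
    request_size: str    # approximate payload size per request
    concurrency: int     # simulated concurrent clients
    algorithms: Tuple[str, ...] = ("Quantum-Safe", "Hybrid", "Classic")  # assumed default

SCENARIOS = [
    Scenario("Light Traffic E-commerce",       100,    "1KB",    20),
    Scenario("IoT Device Control",             300,    "100B",   30),
    Scenario("Healthcare Records Access",      500,    "1MB",    50),
    Scenario("Social Media Surge",             1000,   "1MB",    100),
    Scenario("Online Banking Transactions",    2000,   "1MB",    200),
    Scenario("Ride-Sharing Peak Hours",        3000,   "100B",   300),
    Scenario("Online Retail Peak Sale",        5000,   "10MB",   500),
    Scenario("Video Streaming Service",        10_000, "10MB",   1000),
    Scenario("Content Delivery Network (CDN)", 20_000, "100MB+", 2000),
]

if __name__ == "__main__":
    for s in SCENARIOS:
        print(f"{s.name}: {s.requests} requests x {s.request_size}, "
              f"concurrency {s.concurrency}, algorithms: {', '.join(s.algorithms)}")
```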

Tasks

  • K8S mode - run qujata-curl as a DaemonSet
  • Init app concurrency when the app is loaded
  • Analyze API - support working with multiple curl pods (see the fan-out sketch after this list)
  • New metric for percentage of CPU usage
  • Analyze API - new concurrency parameter
  • Scale up NGINX pods
    (More tasks to be added; this does not cover everything)
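
As a rough illustration of the "multiple curl pods" and "concurrency parameter" tasks, the sketch below splits a requested concurrency level as evenly as possible across the curl pods available to the Analyze API. The function and pod names are hypothetical; the actual API shape is still to be designed.

```python
# Hypothetical helper: distribute a requested concurrency level across the
# available qujata-curl pods (one per node when deployed as a DaemonSet).
# Pod names are illustrative; the real Analyze API may discover them differently.

def split_concurrency(total_concurrency: int, curl_pods: list[str]) -> dict[str, int]:
    """Split concurrent clients as evenly as possible across curl pods."""
    if not curl_pods:
        raise ValueError("no curl pods available")
    base, remainder = divmod(total_concurrency, len(curl_pods))
    return {pod: base + (1 if i < remainder else 0)
            for i, pod in enumerate(curl_pods)}

if __name__ == "__main__":
    pods = ["qujata-curl-node1", "qujata-curl-node2", "qujata-curl-node3"]
    print(split_concurrency(100, pods))
    # -> {'qujata-curl-node1': 34, 'qujata-curl-node2': 33, 'qujata-curl-node3': 33}
```
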
@litalmason added the feature (New feature) label on Dec 24, 2023
@litalmason added this to the 2.0.0 milestone on Dec 24, 2023
@litalmason modified the milestones: 2.0.0, 1.1.0, 1.2.0 on Jan 15, 2024
@litalmason modified the milestones: 1.2.0, 2.0.0 on Jan 30, 2024
@litalmason (Collaborator, Author)

litalmason commented Feb 1, 2024

Design notes with @yeudit - Concurrency input and scalability

  • General idea: We would like to take into account the user input "Concurrency", which specifies how many concurrent users (or clients) we'd like to simulate.
    To achieve the concurrency required by the specified use cases, we need to scale up our K8S cluster running on Azure (AKS).
  • Environment: We will run our experiments on a Linux-based AKS cluster with a predefined number of nodes.
    Each additional node increases the concurrency we can achieve.
    We may use all nodes in the cluster to maximize concurrency. In some cases, when the desired concurrency is low, we will not need all nodes (see the dashed line in the diagram), which simulates small- to medium-traffic applications.
    [diagram: achievable concurrency vs. number of cluster nodes]

Important to know:

  • Concurrency depends on the underlying CPU specs.
  • Concurrency will not improve by creating new pods on the same node if the existing pods already use the node's CPU cores at full capacity.
  • Concurrency improves when adding nodes to the K8S cluster: more nodes means more CPU cores (see the sketch below).
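
Because concurrency scales with CPU cores rather than with extra pods on the same node, the number of nodes needed for a target concurrency follows from a per-node ceiling. A minimal sketch, assuming the per-node capacity value comes from the startup calibration described below:

```python
# Sketch: estimate how many cluster nodes a target concurrency requires,
# given a measured per-node concurrency ceiling. The example capacity value
# is an assumption; the real value should come from the startup calibration.
import math

def nodes_needed(target_concurrency: int, per_node_capacity: int) -> int:
    return max(1, math.ceil(target_concurrency / per_node_capacity))

if __name__ == "__main__":
    per_node = 150  # assumed concurrent clients one node can sustain
    for target in (20, 300, 2000):
        print(f"target concurrency {target}: {nodes_needed(target, per_node)} node(s)")
```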

Design decisions made:

  • When using Docker, we will currently support only one curl container and one NGINX container.
  • When using Kubernetes, for operational simplicity, we deploy curl pods as a DaemonSet, one per node. The curl pod that shares a node with the NGINX pod won't be used.
  • When the app initializes, we need to calculate the max concurrency of the environment we are running on (see the calibration sketch below):
  1. Run 10000 (configurable) iterations with the prime256v1 algorithm (configurable) on a pod/Docker container.
  2. Max concurrency = iterations / total run time
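
The calibration step can be sketched as follows. `run_handshake()` stands in for the real handshake performed via curl from the pod or Docker container; it is not an existing qujata function.

```python
# Sketch of the startup calibration: run a configurable number of iterations
# with a classical algorithm (prime256v1) and derive
#   max concurrency = iterations / total run time
import time

ITERATIONS = 10_000          # configurable, per the design note
ALGORITHM = "prime256v1"     # configurable, per the design note

def run_handshake(algorithm: str) -> None:
    """Placeholder for one TLS handshake against the NGINX server."""
    time.sleep(0.001)        # stand-in for real handshake latency

def calibrate_max_concurrency(iterations: int = ITERATIONS) -> float:
    start = time.perf_counter()
    for _ in range(iterations):
        run_handshake(ALGORITHM)
    total_run_time = time.perf_counter() - start
    return iterations / total_run_time

if __name__ == "__main__":
    print(f"estimated max concurrency: {calibrate_max_concurrency(1000):.0f}")
```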

Research question:

  • When the number of NGINX pods increases due to significant traffic, what happens to memory and CPU? When using PQC/hybrid algorithms, will we see a faster increase in the number of pods? (A pod-count sampling sketch follows below.)
  • What throughput values will we see when using PQC/hybrid algorithms?
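
To start answering the pod-growth question, the NGINX pod count can be sampled while a scenario runs and compared between classic and PQC/hybrid runs. A sketch using the Kubernetes Python client; the namespace and label selector are assumptions about the qujata deployment.

```python
# Hypothetical observation helper: sample the NGINX pod count over time while
# a scenario runs, so scale-up speed can be compared across algorithm types.
# Namespace and label selector are assumptions about the qujata deployment.
import time
from kubernetes import client, config

def sample_nginx_pod_count(namespace: str = "qujata",
                           label_selector: str = "app=nginx",
                           duration_s: int = 60,
                           interval_s: int = 5) -> list[tuple[float, int]]:
    config.load_kube_config()   # use config.load_incluster_config() inside the cluster
    v1 = client.CoreV1Api()
    samples = []
    start = time.time()
    while time.time() - start < duration_s:
        pods = v1.list_namespaced_pod(namespace, label_selector=label_selector)
        samples.append((round(time.time() - start, 1), len(pods.items)))
        time.sleep(interval_s)
    return samples

if __name__ == "__main__":
    for elapsed, count in sample_nginx_pod_count(duration_s=30):
        print(f"t={elapsed:>5.1f}s  nginx pods: {count}")
```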
