diff --git a/docs/components/best-practices/architecture/sizing-your-environment.md b/docs/components/best-practices/architecture/sizing-your-environment.md
index 7bf37b4b39..4c800642b2 100644
--- a/docs/components/best-practices/architecture/sizing-your-environment.md
+++ b/docs/components/best-practices/architecture/sizing-your-environment.md
@@ -87,7 +87,7 @@ The payload size also affects disk space requirements, as described in the next
 
 The workflow engine itself will store data along every process instance, especially to keep the current state persistent. This is unavoidable. In case there are human tasks, data is also sent to Tasklist and kept there, until tasks are completed.
 
-Furthermore, data is also sent Operate and Optimize, which store data in Elasticsearch. These tools keep historical audit data for some time. The total amount of disk space can be reduced by using **data retention settings**. We typically delete data in Operate after 30 to 90 days, but keep it in Optimize for a longer period of time to allow more analysis. A good rule of thumb is something between 6 and 18 months.
+Furthermore, data is also sent to Operate and Optimize, which store data in Elasticsearch. These tools keep historical audit data for the configured retention times. The total amount of disk space can be reduced by using **data retention settings**. We typically delete data in Operate after 30 to 90 days, but keep it in Optimize for a longer period of time to allow more analysis. A good rule of thumb is something between 6 and 18 months.
 
 :::note
 Elasticsearch needs enough memory available to load a large amount of this data into memory.
@@ -139,23 +139,35 @@ First, calculate your requirements using the information provided above, taking
 
 - Throughput: 20,000 process instances / day
 - Disk space: 114 GB
 
-Now you can select a hardware package that can cover these requirements. In this example this fits well into a cluster of size S.
+Now you can select a hardware package that can cover these requirements. In this example, this fits well into a cluster of size 2x.
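+
+As a quick illustration of this selection step, the following sketch is a hypothetical helper, not part of any Camunda tooling; the capacity figures are assumptions taken from the SaaS cluster size table in the next section:
+
+```python
+# Hypothetical sizing helper: pick the smallest SaaS cluster size whose
+# throughput and disk capacity cover the calculated requirements.
+# Capacity figures are taken from the cluster size table below.
+SIZES = {  # size -> (max process instances/day, provisioned disk in GB)
+    "1x": (3_000_000, 64),
+    "2x": (6_000_000, 128),
+    "3x": (9_000_000, 192),
+    "4x": (12_000_000, 256),
+}
+
+def pick_cluster_size(instances_per_day: int, disk_gb: float) -> str:
+    for size, (max_instances, max_disk) in SIZES.items():
+        if instances_per_day <= max_instances and disk_gb <= max_disk:
+            return size
+    return "contact Camunda for custom sizing"
+
+# The example above: throughput alone would fit a 1x cluster, but 114 GB
+# of disk exceeds its 64 GB, so the smallest fitting size is 2x.
+print(pick_cluster_size(20_000, 114))  # -> 2x
+```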
 
 ### Camunda 8 SaaS
 
-Camunda 8 defines three fixed hardware packages you can select from. The table below gives you an indication what requirements you can fulfill with these. If your requirements are above the mentioned numbers, please contact us to discuss a customized sizing.
+Camunda 8 defines four [cluster sizes](/components/concepts/clusters.md#cluster-size) you can select from (1x, 2x, 3x, and 4x) after you have chosen your [cluster type](/components/concepts/clusters.md#cluster-type). The following table gives you an indication of what requirements you can fulfill with each cluster size.
 
-| **\*** | S | M | L |
-| :----------------------------------------------------------------------- | ------------------------------: | ------------------------------: | -------------------------------: |
-| Max Throughput **Tasks/day** | 5.9 M | 23 M | 43 M |
-| Max Throughput **Tasks/second** | 65 | 270 | 500 |
-| Max Throughput **Process Instances/day** | 0.5 M | 2.3 M | 4.3 M |
-| Max Total Number of Process Instances stored (in Elasticsearch in total) | 100 k | 5.4 M | 15 M |
-| Approx resources provisioned **\*\*** | 15 vCPU, 20 GB mem, 640 GB disk | 28 vCPU, 50 GB mem, 640 GB disk | 56 vCPU, 85 GB mem, 1320 GB disk |
+:::note
+Contact your Customer Success Manager if you require a custom cluster size above these requirements.
+:::
+
+| Cluster size | 1x | 2x | 3x | 4x |
+| :---------------------------------------------------------------------------------- | --------------------------------: | ---------------------------------: | ---------------------------------: | ---------------------------------: |
+| Max Throughput **Tasks/day** **\*** | 4.3 M | 8.6 M | 12.9 M | 17.2 M |
+| Max Throughput **Tasks/second** **\*** | 50 | 100 | 150 | 200 |
+| Max Throughput **Process Instances/day** **\*\*** | 3 M | 6 M | 9 M | 12 M |
+| Max Total Number of Process Instances stored (in Elasticsearch in total) **\*\*\*** | 75 k | 150 k | 225 k | 300 k |
+| Approximate resources provisioned **\*\*\*\*** | 11 vCPU, 22 GB memory, 64 GB disk | 22 vCPU, 44 GB memory, 128 GB disk | 33 vCPU, 66 GB memory, 192 GB disk | 44 vCPU, 88 GB memory, 256 GB disk |
+
+The numbers in the table were measured using Camunda 8 (version 8.6), [the benchmark project](https://github.com/camunda-community-hub/camunda-8-benchmark) running on its own Kubernetes cluster, and a [realistic process](https://github.com/camunda/camunda/blob/main/zeebe/benchmarks/project/src/main/resources/bpmn/realistic/bankCustomerComplaintDisputeHandling.bpmn) containing a mix of BPMN symbols such as tasks, events, and call activities, including subprocesses. To calculate day-based metrics, an equal distribution over 24 hours is assumed.
+
+**\*** Tasks (Service Tasks, Send Tasks, User Tasks, and so on) completed per day is the primary metric, as this is easy to measure and has a strong influence on resource consumption. This number assumes a constant load over the day. Tasks/day and Tasks/second scale linearly.
+
+**\*\*** As Tasks are the primary resource driver, the number of process instances supported by a cluster is calculated based on the assumption of an average of 10 tasks per process. Customers can calculate a more accurate process instance estimate using their anticipated number of tasks per process.
+
+**\*\*\*** Total number of process instances within the retention period, regardless of whether they are active or finished. This is limited by the disk space, CPU, and memory available to Elasticsearch for running and historical process instances. Calculated assuming a typical set of process variables per process instance. Note that it makes a difference whether you add one or two strings (requiring ~1 KB of space) to your process instances or attach a full JSON document of 1 MB, as this data needs to be stored in various places, influencing memory and disk requirements. If this number increases, you can still retain the runtime throughput, but Tasklist, Operate, and/or Optimize may lag behind.
 
-**\*** The numbers in the table where measured using Camunda 8 (version 8.0) and [the benchmark project](https://github.com/camunda-community-hub/camunda-8-benchmark). It uses a [ten task process](https://github.com/camunda-community-hub/camunda-8-benchmark/blob/main/src/main/resources/bpmn/typical_process.bpmn). To calculate day-based metrics, an equal distribution over 24 hours is assumed.
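+
+The footnotes above lend themselves to a quick plausibility check. The following sketch is an illustrative calculation only; the 1x figure and the 10-tasks-per-process average are assumptions taken from the table and footnotes above, and the retention default is described below:
+
+```python
+# Derive a process instance estimate from your anticipated tasks per process,
+# as suggested in footnote ** above.
+MAX_TASKS_PER_DAY_1X = 4_300_000   # 1x cluster, from the table above
+
+my_tasks_per_process = 25          # example: a heavier process than the 10-task average
+instances_per_day = MAX_TASKS_PER_DAY_1X / my_tasks_per_process
+print(f"~{instances_per_day:,.0f} process instances/day on a 1x cluster")  # ~172,000
+
+# Stored instances accumulate over the retention period (see below):
+# 20,000 instances/day over a 30-day retention yields 600,000 stored
+# instances, above even the 4x limit of 300 k, so shorter retention or
+# custom sizing would be needed in that scenario.
+print(20_000 * 30)  # 600000
+```
+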
+Data retention has an influence on the amount of data that is kept for completed instances in your cluster. The default data retention is set to 30 days, which means that data older than 30 days is removed from Operate and Tasklist. If a process instance is still active, it remains fully functional in runtime, but customers cannot access historical data older than 30 days from Operate and Tasklist. For Optimize, data retention is set to 6 months, meaning that data older than 6 months is removed from Optimize. Up to certain limits, data retention can be adjusted by Camunda on request. See [Camunda 8 SaaS data retention](/components/concepts/data-retention.md).
 
-**\*\*** These are the resource limits configured in the Kubernetes cluster and are always subject to change.
+**\*\*\*\*** These are the resource limits configured in the Kubernetes cluster and are always subject to change.
 
 You might wonder why the total number of process instances stored is that low. This is related to limited resources provided to Elasticsearch, yielding performance problems with too much data stored there. By increasing the available memory to Elasticsearch you can also increase that number. At the same time, even with this rather low number, you can always guarantee the throughput of the core workflow engine during peak loads, as this performance is not influenced. Also, you can always increase memory for Elasticsearch later on if it is required.
 
@@ -163,7 +175,7 @@
 
 Provisioning Camunda 8 onto your Self-Managed Kubernetes cluster might depend on various factors. For example, most customers already have their own teams providing Elasticsearch for them as a service.
 
-However, the following example shows a possible configuration which is close to a cluster of size S in Camunda 8 SaaS, which can serve as a starting point for your own sizing.
+However, the following example shows a possible configuration which is close to a cluster of size 1x in Camunda 8 SaaS, which can serve as a starting point for your own sizing.
 
 :::note
 Such a cluster can serve roughly 65 tasks per second as a peak load, and it can store up to 100,000 process instances in Elasticsearch (in-flight and history) before running out of disk-space.
diff --git a/docs/components/concepts/clusters.md b/docs/components/concepts/clusters.md
index fefb92cebc..527f34a700 100644
--- a/docs/components/concepts/clusters.md
+++ b/docs/components/concepts/clusters.md
@@ -6,68 +6,87 @@ description: "Learn more about the clusters available in your Camunda 8 plan."
 
 A [cluster](../../guides/create-cluster.md) is a provided group of production-ready nodes that run Camunda 8.
 
-- **Enterprise** plan customers can create as many production or development clusters as they want based on their Enterprise agreement.
-- **Starter** plan customers are limited based on the [fair usage limits of the plan](https://camunda.com/legal/fair-usage-limits-for-starter-plan/).
+When [creating a cluster in SaaS](/components/console/manage-clusters/create-cluster.md), you can choose the cluster **type** and **size** to meet your organization's availability and scalability needs, and to provide control over cluster performance, uptime, and disaster recovery guarantees.
 
-Production clusters come in three sizes: small (S), medium (M), and large (L). To learn more about the size of cluster best suited for your use case, refer to our [Best Practices](/components/best-practices/best-practices-overview.md) for more information on [sizing your runtime environment](/components/best-practices/architecture/sizing-your-environment.md#sizing-your-runtime-environment).
+:::note
 
-The following table shows each plan and available type or size of cluster:
+Prior to 8.6, clusters were configured by hardware size (S, M, L).
-| | Development | Production - S | Production - M | Production - L |
-| ---------- | ----------- | -------------- | -------------- | -------------- |
-| Free Trial | \- | X | \- | \- |
-| Free | \- | \- | \- | \- |
-| Starter | X | X | \- | \- |
-| Enterprise | X | X | X | X |
+- To learn more about clusters prior to 8.6, see previous documentation versions.
+- To learn more about migrating your existing clusters to the newer model, contact your Customer Success Manager.
 
-When you deploy and execute your [BPMN](/components/modeler/bpmn/bpmn.md) or [DMN](/components/modeler/dmn/dmn.md) models on a production cluster, this might impact your monthly (Starter) or annual (Enterprise) total fee, meaning the more you execute your models, the higher your total fee may be.
+:::
 
-## Free Trial cluster
+## Cluster type
 
-Free Trial clusters have the same functionality as a production cluster, but are size [small (S)](/components/best-practices/architecture/sizing-your-environment.md#camunda-8-saas) and only available during your trial period. You cannot convert a Free Trial cluster to a different kind of cluster.
+The cluster type defines the level of availability and uptime for the cluster.
 
-Once you sign up for a Free Trial, you are able to create one production cluster for the duration of your trial.
+You can choose from three different cluster types:
 
-When your Free Trial plan expires, you are automatically transferred to the Free Plan. This plan allows you to model BPMN and DMN collaboratively, but does not support execution of your models. Any cluster created during your trial is deleted, and you cannot create new clusters.
+- **Basic**: A cluster for non-production use, including experimentation, early development, and basic use cases that do not require a guaranteed high uptime.
+- **Standard**: A production-ready cluster with guaranteed higher uptime.
+- **Advanced**: A production-ready cluster with guaranteed minimal disruption and the highest uptime.
 
-### Auto-pause
+### Cluster availability and uptime
 
-Free Trial `dev` (or untagged) clusters are automatically paused eight hours after a cluster is created or resumed from a paused state. Auto-pause occurs regardless of cluster usage.
+| Type | Basic | Standard | Advanced |
+| :-------------------------------------------------- | :-------------------------------------------------------------------------------------- | :--------------------------------------------------------- | :------------------------------------------------------------------------------------- |
+| Usage | Non-production use, including experimentation, early development, and basic use cases. | Production-ready use cases with guaranteed higher uptime. | Production-ready use cases with guaranteed minimal disruption and the highest uptime. |
+| Uptime Percentage <br/> (Core Automation Cluster\*) | 99% | 99.5% | 99.9% |
+| RTO/RPO\*\* <br/> (Core Automation Cluster\*) | RTO: 8 hours <br/> RPO: 24 hours | RTO: 2 hours <br/> RPO: 4 hours | RTO: < 1 hour <br/> RPO: < 1 hour |
+
+\* Core Automation Cluster means the components critical for automating processes and decisions, such as Zeebe, Operate, Tasklist, Optimize, and Connectors.
+
+\*\* RTO (Recovery Time Objective) means the maximum allowable time that a system or application can be down after a failure or disaster before it must be restored. It defines the target time to get the system back up and running. RPO (Recovery Point Objective) means the maximum acceptable amount of data loss measured in time. It indicates the point in time to which data must be restored to resume normal operations after a failure. It defines how much data you can afford to lose. The RTO/RPO figures shown in the table are provided on a best-effort basis and are not guaranteed.
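+
+To make these uptime percentages more tangible, the following sketch converts them into the implied monthly downtime budget. This is a simple illustration assuming a 30-day month; it is not part of any contractual definition:
+
+```python
+# Convert an uptime percentage into the implied downtime budget per month,
+# assuming a 30-day month (43,200 minutes).
+MINUTES_PER_MONTH = 30 * 24 * 60
+
+for cluster_type, uptime in [("Basic", 99.0), ("Standard", 99.5), ("Advanced", 99.9)]:
+    downtime = MINUTES_PER_MONTH * (100 - uptime) / 100
+    print(f"{cluster_type}: up to {downtime:.0f} min (~{downtime / 60:.1f} h) downtime/month")
+# Basic: 432 min (~7.2 h), Standard: 216 min (~3.6 h), Advanced: 43 min (~0.7 h)
+```
+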
-- Clusters tagged as `test`, `stage`, or `prod` do not auto-pause.
-- Paused clusters are automatically deleted after 30 consecutive paused days. You can change the tag to avoid cluster deletion.
-- No data is lost while a cluster is paused. All execution and configuration is saved, but cluster components such as Zeebe and Operate are temporarily disabled until you resume the cluster.
+:::info
+See [Camunda Enterprise General Terms](https://legal.camunda.com/licensing-and-other-legal-terms#camunda-enterprise-general-terms) for term definitions for **Monthly Uptime Percentage** and **Downtime**.
+:::
 
-:::tip
+## Cluster size
 
-To prevent auto-pause, you can:
+The cluster size defines the cluster performance and capacity.
 
-- Tag the cluster as `test`, `stage`, or `prod` instead of `dev`.
-- [Upgrade your Free Trial plan](https://camunda.com/pricing/) to a Starter, Professional, or Enterprise plan.
+After you have chosen your cluster type, you can choose the cluster size that best meets your cluster environment requirements.
+
+To learn more about choosing your cluster size, see [sizing your environment](/components/best-practices/architecture/sizing-your-environment.md#sizing-your-runtime-environment).
+
+- You can choose from four cluster sizes: 1x, 2x, 3x, and 4x.
+- Larger cluster sizes include increased performance and capacity, allowing you to serve more workload.
+- Increased usage such as higher throughput or longer data retention requires a larger cluster size.
+- Each size increase uses one of your available cluster reservations. For example, purchasing two hardware package (HWP) advanced reservations for your production cluster allows you to configure two clusters of size 1x, or one cluster of size 2x.
+
+:::note
+
+Contact your Customer Success Manager to:
+
+- Increase the cluster size beyond the maximum 4x size. This requires custom sizing and pricing.
+- Increase the cluster size of an existing cluster.
+
+:::
 
-## Development clusters
+## Free Trial clusters
 
-Development clusters, available in the Starter and Enterprise plans, are recommended for development, testing, proof of concepts, and demos.
+Free Trial clusters have the same functionality as a production cluster, but are of a Basic type and 1x size, and only available during your trial period. You cannot convert a Free Trial cluster to a different type of cluster.
 
-The way this type of cluster works varies depending on if you are using it in the Starter or the Enterprise plan.
+Once you sign up for a Free Trial, you are able to create one production cluster for the duration of your trial.
 
-### Development clusters in the Enterprise Plan
+When your Free Trial plan expires, you are automatically transferred to the Free Plan. This plan allows you to model BPMN and DMN collaboratively, but does not support execution of your models. Any cluster created during your trial is deleted, and you cannot create new clusters.
 
-Enterprise Plan users can purchase development clusters as part of their Enterprise subscription agreement. Deployment and execution of models (process instances, decision instances, and task users) are included at no extra cost for this type of cluster. Additionally, this type of cluster in the Enterprise plan follows the [standard data retention policy](/components/concepts/data-retention.md) and does not auto-pause when not in use.
+### Auto-pause
 
-Please [contact us](https://camunda.com/contact/) if you are an existing customer and would like to purchase a development cluster.
+Free Trial `dev` (or untagged) clusters are automatically paused eight hours after a cluster is created or resumed from a paused state. Auto-pause occurs regardless of cluster usage.
 
-### Development clusters in the Starter Plan
+You can resume a paused cluster at any time, which typically takes five to ten minutes to complete. See [resume your cluster](/components/console/manage-clusters/manage-cluster.md#resume-a-cluster).
 
-Starter Plan users have one **development cluster** with free execution for development included in their plan. Deployment and execution of models (process instances, decision instances, and task users) are provided at no cost.
+- Clusters tagged as `test`, `stage`, or `prod` do not auto-pause.
+- Paused clusters are automatically deleted after 30 consecutive paused days. You can change the tag to avoid cluster deletion.
+- No data is lost while a cluster is paused. All execution and configuration is saved, but cluster components such as Zeebe and Operate are temporarily disabled until you resume the cluster.
 
-Additional clusters can be purchased through your [billing reservations](/components/console/manage-plan/update-billing-reservations.md).
+:::tip
 
-Additionally in the Starter Plan, the following applies to **development clusters**:
+To prevent auto-pause, you can:
 
-- **Cluster is not highly available & includes less hardware**: Reduced hardware resources and availability compared to production cluster (for example, one Zeebe node only).
-- **Shorter history of processes and decisions**: Data retention in Operate, Optimize, and Tasklist is reduced to one day. For example, pending or historical process instances are deleted after one day as per the [fair usage limits of the Starter plan](https://camunda.com/legal/fair-usage-limits-for-starter-plan/).
+- Tag the cluster as `test`, `stage`, or `prod` instead of `dev`.
+- [Upgrade your Free Trial plan](https://camunda.com/pricing/) to a Starter or Enterprise plan.
+
+:::
diff --git a/docs/components/console/manage-clusters/create-cluster-include.md b/docs/components/console/manage-clusters/create-cluster-include.md
index 981eab0fe3..c78d6e5249 100644
--- a/docs/components/console/manage-clusters/create-cluster-include.md
+++ b/docs/components/console/manage-clusters/create-cluster-include.md
@@ -1,29 +1,44 @@
 ---
 ---
 
-To deploy and run your process, you must create a cluster in Camunda 8.
+To deploy and run your process, you must create a [cluster](/components/concepts/clusters.md) in Camunda 8.
 
 1. To create a cluster, navigate to **Console**, click the **Clusters** tab, and click **Create new cluster**.
-1. Name your cluster. For the purpose of this guide, we recommend using the **Stable** channel and the latest generation.
-1. Select your [region](/docs/reference/regions.md).
-1. Select your [encryption at rest protection level](/docs/components/concepts/encryption-at-rest.md) (enterprise only).
+1. Name your cluster.
+1. Select a [cluster type](/components/concepts/clusters.md#cluster-type) and [cluster size](/components/concepts/clusters.md#cluster-size).
+1. Assign a cluster tag to indicate what type of cluster it is.
+1. Select your [region](/reference/regions.md).
+1. Select your [encryption at rest protection level](/components/concepts/encryption-at-rest.md) (enterprise only).
+1. Select a channel and release. For the purpose of this guide, we recommend using the **Stable** channel and the latest generation.
 1. Click **Create cluster**.
 1. Your cluster will take a few moments to create. Check the status on the **Clusters** page or by clicking into the cluster itself and looking at the **Applications** section.
 
 :::note
 
 - If you haven't created a cluster yet, the **Clusters** page will be empty.
-- Even while the cluster shows a status **Creating**, you can still proceed to begin modeling.
+- You can start modeling even if the cluster shows a **Creating** status.
 
 :::
 
+![cluster-creating-modal](./img/cluster-creating-modal.png)
+
+1. After creating the cluster, you can view the new entry in the **Clusters** tab:
+
+   ![cluster-creating](./img/cluster-overview-new-cluster-creating.png)
+
+2. The cluster is now being set up. During this phase, its state is **Creating**. After one or two minutes, the cluster is ready for use and changes its state to **Healthy**:
+
+   ![cluster-healthy](./img/cluster-overview-new-cluster-healthy.png)
+
+3. After the cluster is created, click on the cluster name to visit the cluster detail page.
+
 ## Development clusters
 
 Starter Plan users have one **development cluster**, with free execution for development included in their plan. Deployment and execution of models (process instances, decision instances, and task users) is provided at no cost.
 
 Additional clusters can be purchased through your [billing reservations](/components/console/manage-plan/update-billing-reservations.md).
 
-Visit the [clusters page](/components/concepts/clusters.md) to learn more about the differences between **development clusters** and **production clusters**.
+To learn more about the differences between **development clusters** and **production clusters**, see [clusters](/components/concepts/clusters.md).
 
 - **Stable**: Provides the latest feature and patch releases ready for most users at a minimal risk. The releases follow semantic versioning and can be updated to the next minor or patch release without data loss.
 - **Alpha**: Provides preview releases in preparation for the next stable release. They provide a short-term stability point to test new features and give feedback before they are released to the stable channel. Try these to ensure the upcoming release works with your infrastructure. These releases cannot be updated to a newer release, and therefore are not meant to be used in production.
@@ -37,14 +52,3 @@
 
 Only organization owners or users with the **Admin** role in Console can deploy to `prod` clusters.
 
 Users without **Admin** roles can deploy only on `dev`, `test`, or `stage` clusters.
 
 :::
-
-![cluster-creating-modal](./img/cluster-creating-modal.png)
-
-1. After you've made your selection and created the cluster, view the new entry in the **Clusters** tab:
-
-![cluster-creating](./img/cluster-overview-new-cluster-creating.png)
-
-2. The cluster is now being set up. During this phase, its state is **Creating**. After one or two minutes, the cluster is ready for use and changes its state to **Healthy**:
-
-![cluster-healthy](./img/cluster-overview-new-cluster-healthy.png)
-
-3. After the cluster is created, click on the cluster name to visit the cluster detail page.
diff --git a/docs/components/console/manage-clusters/img/cluster-creating-modal.png b/docs/components/console/manage-clusters/img/cluster-creating-modal.png
index c76f36051f..b30cf8efcf 100644
Binary files a/docs/components/console/manage-clusters/img/cluster-creating-modal.png and b/docs/components/console/manage-clusters/img/cluster-creating-modal.png differ