-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved Self-Managed Enterprise scaling & troubleshooting #38863
Conversation
Hesperide
commented
Jun 3, 2024
- Adds support documentation addressing many recent customer inquiries
- Adds docs from conversations on scaling last week
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely detailed guide!
@@ -9,12 +9,11 @@ import TabItem from '@theme/TabItem'; | |||
|
|||
[Airbyte Self-Managed Enterprise](./README.md) is in an early access stage for select priority users. Once you [are qualified for a Self-Managed Enterprise license key](https://airbyte.com/company/talk-to-sales), you can deploy Airbyte with the following instructions. | |||
|
|||
Airbyte Self-Managed Enterprise must be deployed using Kubernetes. This is to enable Airbyte's best performance and scale. The core components \(api server, scheduler, etc\) run as deployments while the scheduler launches connector-related pods on different nodes. | |||
Airbyte Self-Managed Enterprise must be deployed using Kubernetes. This is to enable Airbyte's best performance and scale. The core Airbyte components (`server`, `webapp`, `workload-launcher`) run as deployments. The `workload-launcher` is responsible for managing connector-related pods (`check`, `discover`, `read`, `write`, `orchestrator`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to expand on the first sentence and hint at more specific requirements listed later? Something like "Airbyte Self-Managed Enterprise must be deployed on a Kubernetes cluster managed with Helm. This is to enable Airbyte's best performance and scale. We support enterprise deployments on AWS, GCP, or Azure using services outlined in this document."
memory: | ||
``` | ||
|
||
If your Airbyte deployment is underprovisioned, you may often notice occasional 'stuck jobs' that remain in-progress for long periods, with eventual failures related to unavailable pods. If you begin to see such errors, we recommend you follow the troubleshooting steps above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If your Airbyte deployment is underprovisioned, you may often notice occasional 'stuck jobs' that remain in-progress for long periods, with eventual failures related to unavailable pods. If you begin to see such errors, we recommend you follow the troubleshooting steps above. | |
If your Airbyte deployment is underprovisioned, you may notice occasional 'stuck jobs' that remain in-progress for long periods, with eventual failures related to unavailable pods. If you begin to see such errors, we recommend you follow the troubleshooting steps above. |
|
||
## DEBUG Logs | ||
|
||
We recommend turning off `DEBUG` logs for any non-testing use of Self-Managed Airbyte. Failing to do while running at-scale syncs may result in the `server` pod being overloaded, preventing most of the deployment for operating as normal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We recommend turning off `DEBUG` logs for any non-testing use of Self-Managed Airbyte. Failing to do while running at-scale syncs may result in the `server` pod being overloaded, preventing most of the deployment for operating as normal. | |
We recommend turning off `DEBUG` logs for any non-testing use of Self-Managed Airbyte. Failing to do so while running at-scale syncs may result in the `server` pod being overloaded. This would prevent most of the deployment from operating as normal. |
|
||
## Schema Discovery Timeouts | ||
|
||
While configuring a database source connector with hundreds to thousands of tables, each with many columns, the one-time `discover` mechanism - by which we discover the topology of your source - may run for a long time and exceed Airbyte's timeout duration. Should this be the case, you may increase Airbyte's timeout limit as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While configuring a database source connector with hundreds to thousands of tables, each with many columns, the one-time `discover` mechanism - by which we discover the topology of your source - may run for a long time and exceed Airbyte's timeout duration. Should this be the case, you may increase Airbyte's timeout limit as follows: | |
Airbyte uses a one-time `discover` mechanism to map out the topology of your source. If a database source connector has hundreds or even thousands of tables, each with many columns, `discover` may run for long enough that it exceeds Airbyte's timeout duration. In such a case, you may increase Airbyte's timeout limit as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be useful to link to the Airbyte protocol definition for "Discover" as well: understanding-airbyte/airbyte-protocol#discover