Implement parallelization in bloom_build component #15496

Open
emadolsky opened this issue Dec 19, 2024 · 0 comments

Comments

@emadolsky
Contributor

Is your feature request related to a problem? Please describe.
We have tenants that produce tens of terabytes of data per day. Building blooms for these tenants requires a lot of resources, but each bloom_build component can only use one CPU core because there is no parallelization in place. The only way to work around this is to deploy a large number of small single-core pods. Since we use the Simple Scalable Deployment mode, scaling out bloom_build also means scaling out the backend component, which makes the problem more complex and annoying.

Describe the solution you'd like
Run multiple workers per bloom_build instance so that planned tasks are processed in parallel.
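
To illustrate the idea, here is a minimal sketch of a worker pool in Go. It is not taken from the Loki codebase: `Task`, `Result`, `processTask`, and `runWorkers` are hypothetical names, and the real bloom_build task type and processing logic would look different. It only shows the general pattern of several goroutines pulling planned tasks from a shared channel.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical stand-in for one planned bloom-building task.
type Task struct {
	ID int
}

// Result records which worker handled which task.
type Result struct {
	TaskID   int
	WorkerID int
}

// processTask is a placeholder for the CPU-bound work of building blooms.
func processTask(workerID int, t Task) Result {
	return Result{TaskID: t.ID, WorkerID: workerID}
}

// runWorkers starts `workers` goroutines that pull tasks from a shared
// channel, and returns all results once every task has been processed.
func runWorkers(workers int, tasks []Task) []Result {
	if workers < 1 {
		workers = 1 // always keep at least one worker to avoid a deadlock
	}

	taskCh := make(chan Task)
	// Buffered so that workers never block when sending a result.
	resultCh := make(chan Result, len(tasks))

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for t := range taskCh {
				resultCh <- processTask(id, t)
			}
		}(w)
	}

	// Feed tasks to the pool, then signal that no more are coming.
	for _, t := range tasks {
		taskCh <- t
	}
	close(taskCh)

	wg.Wait()
	close(resultCh)

	results := make([]Result, 0, len(tasks))
	for r := range resultCh {
		results = append(results, r)
	}
	return results
}

func main() {
	tasks := make([]Task, 8)
	for i := range tasks {
		tasks[i] = Task{ID: i}
	}
	results := runWorkers(4, tasks)
	fmt.Printf("processed %d tasks\n", len(results))
}
```

With something like this, the number of workers could be exposed as a configuration value and set to the number of cores available to the pod, so one larger pod could do the work of many single-core pods.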

Describe alternatives you've considered
Scaling out with small single-core pods, which means running 100 pods to use 100 cores. This is unpleasant to operate.

@emadolsky emadolsky changed the title Implement parallelization on bloom_build component Implement parallelization in bloom_build component Dec 19, 2024