Reduce GCP Fixed Costs by 50% #2453
❓ Are we sure that the e2 is suitable for the user and worker node groups? For the worker group especially, it might handicap performance. Additionally, e2 instances do not support GPUs. I think it might be better to run the general node group on the e2-highmem-4 and the user and worker node groups on the n4-standard-4. Especially with them scaling down to zero now, I think that would be an acceptable tradeoff of performance vs. price.
Great points, @dcmcand. I looked into it a bit more. CPU performance (CoreMark score) looks roughly equal between the two types. Also, for GPU instances, we usually create new node groups specifically for the GPU profiles, although that is not fully documented at the moment (example in the image below), so I don't see that as an issue for the user or worker node defaults: users would not use these default node groups for their GPU instances. The only disadvantage I see is that the maximum egress drops from 10 to 8 Gbps, but I believe the cost savings are worth the 20% reduction in bandwidth for most users, though that's just a hunch.
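For anyone checking the 20% figure, here is a quick sketch of the arithmetic. It uses only the two egress caps quoted above; the function and variable names are just illustrative.

```python
def percent_reduction(old: float, new: float) -> float:
    """Return the relative drop from `old` to `new` as a percentage."""
    return (old - new) * 100 / old


# Egress caps as quoted in the comment above (Gbps).
old_egress_gbps = 10
new_egress_gbps = 8

reduction = percent_reduction(old_egress_gbps, new_egress_gbps)
print(f"Maximum egress drops by {reduction:.0f}%")
```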
@dcmcand any further concerns or comments?
I still feel a bit of concern, specifically around the Dask worker. However, I have no data to actually justify my concern. If someone upgrades and then applies the new config, it will result in the nodes being replaced. Do we have any concerns about that? I also feel like we should make sure we document this change and how to restore the original behavior. Maybe in the FAQ? But probably also in the release notes.
We've added node types to the nebari config that is created when running
I'll document it in the Nebari upgrade command so that users will be notified if this will affect them and what they need to add to their config so that it won't affect them, and we can copy something similar to the release notes.
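As a rough illustration of the kind of check the upgrade notice could make, here is a minimal sketch. It is not the actual upgrade code: the helper name is hypothetical, and it assumes the instance type for each group lives under `google_cloud_platform.node_groups.<group>.instance` in the loaded config.

```python
# Hypothetical helper, not part of Nebari: lists the default node groups
# whose instance type is not pinned in the user's config. Those groups
# would pick up the new default and have their nodes replaced on deploy.
DEFAULT_GROUPS = ("general", "user", "worker")


def groups_without_pinned_instance(config: dict) -> list[str]:
    # Assumed layout: google_cloud_platform -> node_groups -> <name> -> instance
    gcp = config.get("google_cloud_platform") or {}
    node_groups = gcp.get("node_groups") or {}
    return [
        name
        for name in DEFAULT_GROUPS
        if not (node_groups.get(name) or {}).get("instance")
    ]


# Example config (already parsed into a dict) that pins only one group.
example_config = {
    "google_cloud_platform": {
        "node_groups": {
            "general": {"instance": "e2-highmem-4", "min_nodes": 1, "max_nodes": 1},
            "user": {"min_nodes": 0, "max_nodes": 5},
        }
    }
}

affected = groups_without_pinned_instance(example_config)
if affected:
    print("Instance type not pinned for:", ", ".join(affected))
```

A user who wants to keep their current machines would add an explicit instance type for each group this reports before running the next deploy.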
sounds good. thanks @Adam-D-Lewis |
We don't know what the next Nebari version will be at the moment (2024.5.2 vs. 2024.6.1), so I opened a separate PR, #2466, and assigned it to the 2024.5.2 milestone. My thought is that we merge this as is and make sure to merge the other one with the appropriate version number during the next release step.
Reference Issues or PRs
Fixes #2452
What does this implement/fix?
Put an x in the boxes that apply
Testing
Any other comments?