[EPIC] Support attributing costs to individual hubs automatically on Openscapes #4453
Comments
For storage costs, we will switch to one EFS per hub. This has no particular cost implications, because AWS EFS is billed per use. I was going to suggest we move to multiple nodepools for cost monitoring, but it turns out AWS has actually done a pretty decent job of 'splitting costs' per namespace! https://aws.amazon.com/blogs/aws-cloud-financial-management/improve-cost-visibility-of-amazon-eks-with-aws-split-cost-allocation-data/. I'll have a spike specc'd out soon to determine how to do this. |
The spike was completed in #4453, with the outcome that:
I've refined and added tasks to move each hub to its own dedicated nodepool. |
This is great @yuvipanda - let me know how I can help! |
Instead of drilling down into this further, I have written out a more detailed definition of done, and will work with @consideRatio to have him do just enough refinement to complete the tasks.
Definition of done
There exists a Grafana dashboard that looks like this:
Details
Numbers in purple indicate priority ordering, helpful for scoping conversations.
Fixed costs include the core nodepool, any PV needed for the support chart or hub databases, Kubernetes master API costs, and the cost of any load balancer services (if they cost money).
Note that tagging the EKS cluster itself requires recreating it, which we don't want to do. Other active tags can be used to include that information though.
Object storage is all S3 related cost from the scratch and persistent buckets, not counting requester pays.
"Compute" is all EC2 cost, including root disks, networking and GPU.
Home directory should include home directory and backup costs.
Total cost should include all 2i2c managed infrastructure.
Validation
Each of these graphs needs to be validated so we can trust them and find pieces we have missed, as well as spot bugs in the Athena query.
Timeline
I would like this to be done within the next 3 sprints (so 2 full sprints with Erik available). We can cut scope as needed.
Next steps
|
@ateucher today pointed me to https://docs.aws.amazon.com/cost-management/latest/userguide/ce-api.html, which I had totally missed while doing #4465. I think the lesson for me is that I should hand off at the level in #4453 (comment) earlier, and rely on others to do such spikes. Regardless, I think it's early enough that we should investigate this alternative to Athena. It would involve:
There are a few major advantages over using Athena:
So if possible, we should prefer this method. We can reuse all the work we had done, except for some parts of #4546. The next step here is to design a spike to validate this (instead of #4544). The Athena-specific issues that are subtasks of this can be closed if we are going to take this approach. I'm going to take a slightly different approach here and not write out the spike myself. Instead I'll work with @consideRatio to help him both scope out and accomplish this work. |
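A minimal sketch of what querying the Cost Explorer API per hub could look like, assuming the `2i2c:hub-name` tag from the task list is activated as a cost allocation tag. The function names and the `"untagged"` label are hypothetical, pagination via `NextPageToken` is ignored, and this is not the implementation from the linked issues.

```python
from collections import defaultdict
from datetime import date

HUB_TAG = "2i2c:hub-name"  # tag key mentioned in this issue's task list


def build_request(start: date, end: date) -> dict:
    """Build keyword arguments for Cost Explorer's GetCostAndUsage call."""
    return {
        # End is exclusive in the Cost Explorer API.
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "TAG", "Key": HUB_TAG}],
    }


def total_cost_per_hub(response: dict) -> dict:
    """Sum daily costs per hub from a GetCostAndUsage response."""
    totals = defaultdict(float)
    for day in response["ResultsByTime"]:
        for group in day["Groups"]:
            # Tag group keys look like "2i2c:hub-name$staging"; an empty
            # value after "$" means the cost carried no such tag.
            _, _, hub = group["Keys"][0].partition("$")
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            # "untagged" is a label made up for this sketch.
            totals[hub or "untagged"] += amount
    return dict(totals)


def fetch(start: date, end: date) -> dict:
    """Query AWS; needs boto3 and credentials allowing ce:GetCostAndUsage."""
    import boto3  # imported lazily so the parsing above works offline

    client = boto3.client("ce")
    return client.get_cost_and_usage(**build_request(start, end))
```

Calling `total_cost_per_hub(fetch(date(2024, 8, 1), date(2024, 9, 1)))` would then give one total per hub for August 2024, with costs lacking the tag collected under a single key.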
It does have this limitation:
Athena does not. |
While working on #4713 and #4712, I've taken these notes:
Summary
Notes
Wanted accounting details
Overview of tags
Use of the AWS tag editor helped figure these things out:
Accounting for known 2i2c infra total
Based on a given cluster name, such as
Still not accounted costs
These costs for openscapes in August 2024, each greater than 1 USD, aren't accounted for:
USW2-PublicIPv4:InUseAddress: $11.68
We have public IPs from three sources:
Public IPs cost $0.005/hour, so this becomes 24*0.005 == 0.12 USD per public IP per day.
My guess is that we aren't attributing costs for the NAT Gateway IP, or the k8s
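As a rough cross-check of the figures above, the arithmetic can be extended to the whole month. This sketch assumes the $11.68 line item covers all 31 days of August 2024 at the quoted hourly rate; the variable names are made up for illustration.

```python
HOURLY_RATE_USD = 0.005   # public IPv4 price quoted above
BILLED_USD = 11.68        # USW2-PublicIPv4:InUseAddress, August 2024
HOURS_IN_AUGUST = 24 * 31

cost_per_ip_per_day = 24 * HOURLY_RATE_USD
cost_per_ip_per_month = HOURS_IN_AUGUST * HOURLY_RATE_USD
# Average number of addresses in use implied by the bill.
implied_ip_count = BILLED_USD / cost_per_ip_per_month

print(f"{cost_per_ip_per_day:.2f} USD per IP per day")
print(f"{cost_per_ip_per_month:.2f} USD per IP for August")
print(f"{implied_ip_count:.2f} IPs in use on average")
```

Under that assumption the bill corresponds to roughly three addresses held for the whole month, which by itself does not say which addresses they are.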
USW2-WarmStorage-ByteHrs-EFS: $3.76
This seems associated with backup, because there is a concept between warm /
We have an automated backup vault, but it isn't tagged by anything. At the same
Anyhow, I think this isn't worth further investigation.
Accounting for hub attributed costs
Accounting for hub attributed costs
Like for hub attributed costs, but also grouping by service types and then |
I think this is now in a sufficiently functional state for openscapes people to start looking at it. It can be viewed at https://grafana.openscapes.2i2c.cloud/d/edw06h7udjwg0b/cloud-cost-attribution?orgId=1. |
Closing as completed: this is functional for openscapes. When I was asked to provide a definition of done, I considered documentation on scaling this to other hubs to be outside this openscapes-focused epic. Such future steps are now tracked in #4872. |
As part of [Initiative] Hub Scale Cost Monitoring #4384, we want to support attributing costs to individual hubs on AWS.
We don't want to do this on all hubs on all clusters, but need to pick a cluster that has multiple hubs already in it to attribute costs. Let's pick openscapes - it has a staging and prod hub, but also a workshop hub!
While this EPIC is focused on openscapes, completing it should tell us exactly what we would need to do to repeat this on any other cluster.
Tasks
staging hub on openscapes hub to its own EFS #4461
2i2c:hub-name tags to buckets #4710
To meet the definition of done
Definition of done