Before you go code spelunking, these blog posts are a good source of information on the Prometheus TSDB format, which is what Mimir/Cortex/Thanos use:
-
Hello! I'm evaluating Mimir versus Amazon Managed Prometheus, so I've been doing some cost forecasting. I appreciate the insight you've provided in two places on that front:
When modeling costs, the biggest driver I can see with Mimir is not CPU, memory, disk, or storage capacity, but the sheer count of PUT/GET requests to S3, since those requests cost money:
For example, if we have 2000 active time series, a replication factor of 3, and ingesters write 5 files per hour for each time series, that's ~$100 in S3 PUTs per month:
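For reference, the arithmetic behind that estimate can be sketched as follows. The per-series file rate and the PUT price are my assumptions, not benchmarks (the $0.005 per 1,000 PUTs figure is the S3 Standard rate in us-east-1; check your region):

```python
# Back-of-the-envelope S3 PUT cost model (all inputs are assumptions, not benchmarks).
active_series = 2000
replication_factor = 3
files_per_series_per_hour = 5      # hypothetical per-series write rate from the question
hours_per_month = 730
put_price_per_1000 = 0.005         # assumed S3 Standard PUT price; varies by region

puts_per_month = (active_series * replication_factor
                  * files_per_series_per_hour * hours_per_month)
cost = puts_per_month / 1000 * put_price_per_1000
print(f"{puts_per_month:,} PUTs/month -> ${cost:.2f}")  # 21,900,000 PUTs/month -> $109.50
```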
GETs could also get expensive, but I understand that heavy caching is used to mitigate that.
Is the logic above correct? Or can blocks consolidate multiple time series into one, minimizing the total number of S3 PUTs? Is there any more information available (e.g., benchmarks) on typical S3 GET/PUT counts per active time series?
Basically, is there any benchmarking data or rule of thumb for roughly what the de-amplification factor discussed winds up being:
For more context, the Cortex documentation seems to suggest that samples from multiple time series are consolidated into the same block(s), which I believe may further reduce the number of GETs/PUTs. But I haven't seen what that de-amplification ratio winds up being.
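To illustrate why consolidation would matter so much, here is a hypothetical sketch of the consolidated model, under the assumption that each ingester cuts one TSDB block per 2-hour range covering all of its series, and that a block uploads as roughly four objects (meta.json, index, a chunks segment, tombstones). None of these numbers come from Mimir benchmarks:

```python
# Hypothetical consolidated-block model (assumed parameters, not measured values).
ingesters_holding_series = 3       # with replication factor 3, each series is on 3 ingesters
blocks_per_ingester_per_day = 12   # assuming the default 2-hour TSDB block range
objects_per_block = 4              # rough guess: meta.json, index, chunks segment, tombstones
days_per_month = 30
put_price_per_1000 = 0.005         # assumed S3 Standard PUT price

puts_per_month = (ingesters_holding_series * blocks_per_ingester_per_day
                  * objects_per_block * days_per_month)
cost = puts_per_month / 1000 * put_price_per_1000
print(f"{puts_per_month:,} PUTs/month -> ${cost:.4f}")  # 4,320 PUTs/month -> $0.0216
```

Under these assumptions the PUT count scales with the number of ingesters and blocks rather than with the number of active series, which is the de-amplification effect I'm asking about.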
PS: I'll go code spelunking here in a second, but am also opening this as a discussion to highlight that it may be worth addressing the above in the docs.