bucketweb: Show block size in details pane #7205

Closed
outofrange opened this issue Mar 13, 2024 · 4 comments · Fixed by #7233

Comments

@outofrange
Contributor

outofrange commented Mar 13, 2024

Is your proposal related to a problem?

I would like to be able to easily judge

  • what blocks (clusters / block streams) take up the most space
  • how much impact different resolutions have on storage space
  • how much impact compaction levels have
  • if deduplication has an effect
  • how effective dropping series via bucket rewrite was
  • what old blocks to delete when freeing up space is required

I cobbled together a simple script that queries S3, but I'm wondering whether this would be a good addition to Thanos Bucketweb.

Describe the solution you'd like

Bucketweb already shows details for blocks when clicking on one.
Currently, the details pane shows information taken from meta.json, such as time range, duration, counts for series/samples/chunks, resolution, level, source and labels.

I'd also like to see a human-readable representation of the storage space actually used, both for metric data and for the index.

This would require

  • the backend to query this information from S3 for each block, either once when syncing metadata (preferred I guess), or once per request
  • the frontend to present the space usage for selected blocks
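
For the "human-readable" part, here is a minimal sketch of what the formatting could look like. It is only an illustration in Go: the real Bucketweb frontend would do this in its own code, and the function name `humanBytes` and the choice of binary (KiB/MiB/GiB) units are my assumptions, not anything taken from Thanos.

```go
package main

import "fmt"

// humanBytes is a hypothetical helper that renders a byte count using
// binary (IEC) units, for example 1536 becomes "1.5 KiB".
func humanBytes(n int64) string {
	const unit = 1024
	if n < unit {
		return fmt.Sprintf("%d B", n)
	}
	// Find the largest power of 1024 that still fits into n.
	div, exp := int64(unit), 0
	for v := n / unit; v >= unit; v /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %ciB", float64(n)/float64(div), "KMGTPE"[exp])
}

func main() {
	fmt.Println(humanBytes(512))
	fmt.Println(humanBytes(1536))
	fmt.Println(humanBytes(5 * 1024 * 1024 * 1024))
}
```

Whether binary or decimal units are shown is a presentation choice; the important part is that the raw byte counts stay available for sorting and summing.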

Describe alternatives you've considered

Doing it with custom scripts is possible, but having it in the official UI would be great.

I also thought about something like a Thanos Bucket Prometheus Exporter to provide more metadata via Prometheus, but that would not be as nice as having it directly in the UI.

To bring it to the UI, my approach would be to retrieve used space from S3 within bucketweb. As far as I know, this requires multiple (maybe recursive?) calls, as there is no simple S3 API for "calculate total bytes used by directory XYZ".
This step could be avoided by adding a field like totalSize (and maybe something like metricSize and indexSize as well) to meta.json when producing the block - this might be the better solution, especially when other tools could use this info too, but I can't say if touching a component that central is "worth it".
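
To make the "multiple calls" concern concrete, here is a rough sketch of summing object sizes under a block's prefix with a paginated listing. The `lister` interface, the `object` type, the in-memory `memLister` and the block IDs are all hypothetical stand-ins I made up for illustration; this is not the Thanos objstore API or any S3 SDK.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// object is a hypothetical, minimal view of one stored object.
type object struct {
	Key  string
	Size int64
}

// lister is a hypothetical stand-in for an object storage client that
// returns listings one page at a time.
type lister interface {
	// ListPage returns the objects under prefix starting at token, plus
	// the token for the next page ("" when there are no more pages).
	ListPage(prefix, token string) (objs []object, next string, err error)
}

// blockSize sums the size of every object under "<blockID>/".
// It needs one call per page, which is the cost discussed above.
func blockSize(l lister, blockID string) (int64, error) {
	var total int64
	token := ""
	for {
		objs, next, err := l.ListPage(blockID+"/", token)
		if err != nil {
			return 0, fmt.Errorf("list %s: %w", blockID, err)
		}
		for _, o := range objs {
			total += o.Size
		}
		if next == "" {
			return total, nil
		}
		token = next
	}
}

// memLister is an in-memory lister used only to exercise blockSize.
type memLister struct {
	objs     []object
	pageSize int
}

func (m memLister) ListPage(prefix, token string) ([]object, string, error) {
	var matched []object
	for _, o := range m.objs {
		if strings.HasPrefix(o.Key, prefix) {
			matched = append(matched, o)
		}
	}
	start := 0
	if token != "" {
		n, err := strconv.Atoi(token)
		if err != nil {
			return nil, "", fmt.Errorf("bad token %q: %w", token, err)
		}
		start = n
	}
	if start > len(matched) {
		start = len(matched)
	}
	size := m.pageSize
	if size <= 0 {
		size = len(matched) // treat a non-positive page size as "one page"
	}
	end := start + size
	if end >= len(matched) {
		return matched[start:], "", nil
	}
	return matched[start:end], strconv.Itoa(end), nil
}

func main() {
	store := memLister{pageSize: 2, objs: []object{
		{"01ABC/meta.json", 500},
		{"01ABC/index", 2000},
		{"01ABC/chunks/000001", 10000},
		{"01ABC/chunks/000002", 8000},
		{"01XYZ/index", 999},
	}}
	total, err := blockSize(store, "01ABC")
	if err != nil {
		panic(err)
	}
	fmt.Println(total)
}
```

With many blocks this adds up to at least one listing call per block, which is why having the sizes recorded in meta.json would be cheaper.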

Additional context

With this information available, other presentations would be possible as well, such as summing the size of all blocks to show the total usage per block stream / UI row.

Edit:
Oh, as this seems to be simple enough for a Go noob, I can imagine working on a PR (if / when I find time...)!

@douglascamata
Contributor

@outofrange I believe you can get this information for each block from its meta.json file, including the byte size of the block's index and chunk files. With this information you shouldn't need to do a recursive list in the object storage to calculate the size.
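
As a rough sketch of reading those sizes: the snippet below assumes meta.json carries a `thanos.files` list whose entries have `rel_path` and `size_bytes` fields, and that chunk files live under `chunks/`. Please check those names against a real meta.json before relying on them; the sample document, the `sizes` type and `sizesFromMeta` are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// meta mirrors only the fields needed here. The JSON field names are an
// assumption about the meta.json layout, not a copy of the Thanos types.
type meta struct {
	ULID   string `json:"ulid"`
	Thanos struct {
		Files []struct {
			RelPath   string `json:"rel_path"`
			SizeBytes int64  `json:"size_bytes"`
		} `json:"files"`
	} `json:"thanos"`
}

// sizes holds the per-block byte counts derived from the file list.
type sizes struct {
	Chunks int64
	Index  int64
	Total  int64
}

// sizesFromMeta sums the recorded file sizes, split into chunks and index.
// Entries without a recorded size simply contribute zero.
func sizesFromMeta(raw []byte) (sizes, error) {
	var m meta
	if err := json.Unmarshal(raw, &m); err != nil {
		return sizes{}, fmt.Errorf("parse meta.json: %w", err)
	}
	var s sizes
	for _, f := range m.Thanos.Files {
		switch {
		case strings.HasPrefix(f.RelPath, "chunks/"):
			s.Chunks += f.SizeBytes
		case f.RelPath == "index":
			s.Index += f.SizeBytes
		}
		s.Total += f.SizeBytes
	}
	return s, nil
}

func main() {
	// Hypothetical sample; real files contain many more fields.
	raw := []byte(`{
	  "ulid": "01ABC",
	  "thanos": {"files": [
	    {"rel_path": "chunks/000001", "size_bytes": 1000},
	    {"rel_path": "index", "size_bytes": 250},
	    {"rel_path": "meta.json"}
	  ]}
	}`)
	s, err := sizesFromMeta(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(s.Chunks, s.Index, s.Total)
}
```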

I often download the meta.json file through the Compactor UI to look at the size of chunks and indexes. Would be awesome to have this information there directly in the page.

Feel free to work on this and come over to #thanos-dev in the CNCF slack if you need any help.

@outofrange
Contributor Author

@douglascamata PR is open :)

I added values I'd benefit from:

  • total size of blocks
  • size of all chunks and size of index - also in relation to total size, to get a feeling for index overhead on different compaction levels
  • daily growth to estimate future storage requirements and compare different compaction levels
  • sum of all block sizes per source
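
For anyone curious how such values can be derived, here is a simplified sketch and not the code from the PR. It assumes "daily growth" means total block size divided by the block's duration in days, and that block times are in milliseconds; the `block` type, the function names and the sample numbers are hypothetical.

```go
package main

import "fmt"

// block is a hypothetical, reduced view of one block's metadata.
type block struct {
	Source     string
	MinTime    int64 // milliseconds since epoch (assumed)
	MaxTime    int64 // milliseconds since epoch (assumed)
	ChunkBytes int64
	IndexBytes int64
}

func (b block) total() int64 { return b.ChunkBytes + b.IndexBytes }

// indexShare returns the index size as a fraction of the total size.
func indexShare(b block) float64 {
	if b.total() == 0 {
		return 0
	}
	return float64(b.IndexBytes) / float64(b.total())
}

// dailyGrowth returns bytes per day covered by the block, under the
// assumption stated above. It returns 0 for an empty time range.
func dailyGrowth(b block) float64 {
	const msPerDay = 24 * 60 * 60 * 1000
	days := float64(b.MaxTime-b.MinTime) / msPerDay
	if days <= 0 {
		return 0
	}
	return float64(b.total()) / days
}

// sumBySource adds up total block sizes per source.
func sumBySource(blocks []block) map[string]int64 {
	out := make(map[string]int64)
	for _, b := range blocks {
		out[b.Source] += b.total()
	}
	return out
}

func main() {
	const day = int64(24 * 60 * 60 * 1000)
	blocks := []block{
		{Source: "compactor", MinTime: 0, MaxTime: 2 * day, ChunkBytes: 750, IndexBytes: 250},
		{Source: "compactor", MinTime: 2 * day, MaxTime: 4 * day, ChunkBytes: 900, IndexBytes: 100},
		{Source: "sidecar", MinTime: 4 * day, MaxTime: 5 * day, ChunkBytes: 300, IndexBytes: 100},
	}
	fmt.Println(indexShare(blocks[0]), dailyGrowth(blocks[0]))
	fmt.Println(sumBySource(blocks))
}
```

Comparing `indexShare` across compaction levels gives a feeling for index overhead, and `dailyGrowth` gives a rough basis for estimating future storage needs.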

I'm not sure whether this is too much, or whether it could benefit from more or a different presentation, so I'd be happy to get feedback :)

@yeya24
Contributor

yeya24 commented Mar 26, 2024

#3221
I am wondering if we can finally close this?

And maybe this one #3219

@douglascamata
Contributor

#3221 can 100% be closed. I'll do it.

#3219 not yet, as most of its stats are per block stream. We are discussing block stream UI improvements on #7237.
