
Commit

fix: Adjust Self Monitor storage size (#1482)
hisarbalik authored Oct 1, 2024
1 parent 976cd65 commit 88c1bbb
Showing 2 changed files with 3 additions and 3 deletions.
4 changes: 2 additions & 2 deletions docs/contributor/arch/014-telemetry-self-monitor-storage.md
@@ -16,7 +16,7 @@ The Telemetry self-monitoring data is stored in the Prometheus TSDB, which is de
### Storage and Retention with TSDB

The TSDB storage size-based retention works as follows: It includes data blocks like the write-ahead-log (WAL), the checkpoints, the m-mapped chunks, and the persistent blocks. The TSDB counts all those data blocks to decide performing any retention.
-Even if the size of all those data blocks exceeds the configured retention size, only persistence blocks are deleted because the WAL, checkpoints, and m-mapped chunks are needed for normal operation of TSDB. The WAL segments can grow up to 128MB before compacting, and Prometheus will keep at least 3 WAL files; [so-called 2/3 rules](https://ganeshvernekar.com/blog/prometheus-tsdb-wal-and-checkpoint/#wal-truncation). To ensure that Telemetry self-monitoring doesn't exceed the storage limit, minimum storage volume size should be calculated to be at least 3 * WAL segment size + some more space for the other data types.
+Even if the size of all those data blocks exceeds the configured retention size, only persistence blocks are deleted because the WAL, checkpoints, and m-mapped chunks are needed for normal operation of TSDB. The WAL segments can grow up to 128MB before compacting, and Prometheus will keep at least 3 WAL files; [so-called 2/3 rules](https://ganeshvernekar.com/blog/prometheus-tsdb-wal-and-checkpoint/#wal-truncation). To ensure that Telemetry self-monitoring doesn't exceed the storage limit, minimum storage volume size should be calculated to be at least 3 * WAL segment size * 2 + some more space for the other data types.

### TSDB Storage architecture and retention

@@ -26,4 +26,4 @@ For the TSDB WAL and checkpoint architecture, see [Prometheus TSDB: WAL and Chec

## Consequences

-Even though the Telemetry self-monitoring collects very little data for operation (currently, a few MBytes), the storage size must be at least 500MByte for a normal and safe operation.
+Even though the Telemetry self-monitoring collects very little data for operation (currently, a few MBytes), the storage size must be at least 1000MByte for a normal and safe operation.
2 changes: 1 addition & 1 deletion internal/resources/selfmonitor/resources.go
@@ -31,7 +31,7 @@ const (
)

var (
-storageVolumeSize = resource.MustParse("500Mi")
+storageVolumeSize = resource.MustParse("1000Mi")
cpuRequest = resource.MustParse("0.1")
memoryRequest = resource.MustParse("50Mi")
cpuLimit = resource.MustParse("0.2")
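
The sizing rule in the updated doc (3 * WAL segment size * 2, plus room for other data) can be checked with a small standalone sketch. This is not code from the repository: the constant and function names below are hypothetical, and it assumes the doc's "128MB" segment size and the `Mi` quantities in `resources.go` can both be treated as MiB. Under those assumptions, the WAL floor alone exceeds the old 500Mi volume, while 1000Mi leaves space for checkpoints, m-mapped chunks, and persistent blocks.

```go
package main

import "fmt"

// Values taken from the ADR text; names are hypothetical, not from the repo.
const (
	walSegmentMiB = 128 // maximum size of one WAL segment (assumed MiB)
	minWALFiles   = 3   // Prometheus keeps at least 3 WAL files
	safetyFactor  = 2   // the "* 2" factor added by this commit
)

// walFloorMiB returns the space the WAL alone may occupy under the doc's rule.
func walFloorMiB() int {
	return minWALFiles * walSegmentMiB * safetyFactor
}

// hasHeadroom reports whether a volume leaves space beyond the WAL floor
// for checkpoints, m-mapped chunks, and persistent blocks.
func hasHeadroom(volumeMiB int) bool {
	return volumeMiB > walFloorMiB()
}

func main() {
	fmt.Printf("WAL floor: %d MiB\n", walFloorMiB())
	for _, size := range []int{500, 1000} {
		fmt.Printf("volume %d MiB has headroom: %t\n", size, hasHeadroom(size))
	}
}
```

How much headroom beyond the WAL floor is actually needed is not stated in the commit; the sketch only checks that some remains.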