feat(v2): metadata string interning #3744
Merged
The change implements string interning in the metadata model (API) and in the index persistence layer. It introduces an option to store sharable shard-level assets in the index.
This change reduces the index snapshot size by a factor of 5-10. In an environment with 1-2K services and the default segmentation and compaction parameters, we can now expect approximately 650-800KB of data to be added to the index daily per shard. In a deployment with 128 shards (roughly corresponding to 1GB/s of ingested traffic), the snapshot size will accumulate to 2.5-3GB of data (at rest).
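As a back-of-the-envelope check of these figures, the sketch below multiplies the per-shard daily growth by the shard count and estimates how many days of data the quoted totals correspond to. The helper names are hypothetical, decimal units (1GB = 1,000,000KB) are assumed, and the resulting figure of roughly 30 days is my inference rather than something stated above.

```go
package main

import "fmt"

// dailyGrowthKB returns the index growth per day across all shards, in KB.
func dailyGrowthKB(perShardKB, shards int) int {
	return perShardKB * shards
}

// daysToReach returns how many days of growth at dailyKB add up to totalKB.
func daysToReach(totalKB, dailyKB int) float64 {
	return float64(totalKB) / float64(dailyKB)
}

func main() {
	const shards = 128
	low := dailyGrowthKB(650, shards)  // 83200 KB, about 83MB per day
	high := dailyGrowthKB(800, shards) // 102400 KB, about 102MB per day

	// Assumes decimal units: 2.5GB = 2,500,000 KB and 3GB = 3,000,000 KB.
	fmt.Printf("daily growth: %d-%d KB\n", low, high)
	fmt.Printf("days to 2.5GB at low rate: %.1f\n", daysToReach(2_500_000, low))
	fmt.Printf("days to 3GB at high rate: %.1f\n", daysToReach(3_000_000, high))
}
```

Under these assumptions, both ends of the range work out to about a month of accumulated index data.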
Future work should focus on index snapshot compaction. Currently, the snapshot contains all partitions, including those that are effectively immutable because of ingestion window restrictions, and we should exploit this fact. This is why I changed the way partitioning is handled and removed the configuration option that controls the partition duration: this would be much more difficult to handle if partitions overlapped. Instead, partitions should be "immutable": once created, a partition includes all the entries with corresponding timestamps, regardless of the current partition duration.
In the screenshot, you can see metrics from the dev environment with 12 shards; the deployment is 3 days old: