Block store duplicates lots of data #3085
Labels
area/db-system
Related to the core system related components of the DB
perf
Performance issue or suggestion
There is a lot of duplicate data held in the block store, and we can shrink its size considerably if we want to.
Note: The block store accounts for the vast majority of Defra's storage requirements. Bigger blocks also mean slower writes and reads, and more network traffic.
These issues also permit structural data divergence: because the data is duplicated, the copy held on the parent may diverge from the copy held on the child. This structure also prevents parents from containing more than one unique value (e.g. a composite of multiple fields, or multiple documents).
Testing also finds that property name length does impact the storage size under our current encoding. We may wish to reduce this too (or, less likely, change the encoding so that field name length has no impact on size).
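To illustrate why field name length matters, here is a minimal sketch of the effect in any self-describing encoding that stores the field name inside every block. It uses Python's standard `json` module purely as a stand-in; it is not Defra's actual block encoding, and the function name, field names, and block count are all hypothetical.

```python
import json


def total_encoded_size(field_name: str, value: int, copies: int) -> int:
    """Bytes needed to store `copies` blocks that each hold one named field.

    JSON is only a stand-in for a self-describing encoding (an assumption,
    not the real block format): the field name is written into every block.
    """
    block = json.dumps({field_name: value}, separators=(",", ":")).encode("utf-8")
    return len(block) * copies


# Hypothetical field names; only their lengths differ (1 vs 22 characters).
short_total = total_encoded_size("n", 42, copies=1000)
long_total = total_encoded_size("customerAccountBalance", 42, copies=1000)

# Every extra character in the name is paid for once per block.
print(short_total, long_total, long_total - short_total)
```

In this sketch the overhead grows linearly with both the name length and the number of blocks. That is why either shortening the stored names or replacing them with a fixed-size identifier would remove the dependency.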
Tasks