Currently, virtual sstables aren't prioritized for compaction at all (see #2892 for the simpler, non-shared case). However, there's an additional dimension to consider when compacting shared virtual sstables: the proportion of a backing sstable that's referenced by other Pebble instances (possibly on other nodes). This proportion can be lazily updated in the marker files placed on shared storage, and Pebble can occasionally sweep these marker files to update its own estimate of how much of an sstable is referenced by other nodes. A file whose reference proportion is low, even after summing reference percentage points across all nodes, should be prioritized for compaction.
Some examples of how this could be implemented:
- If two Pebbles reference the entirety of a file, 200% of the file is referenced, and we can deprioritize it for compaction picking within that level, preferring other files instead.
- If two Pebbles reference 10% of a file each, 20% of the file is referenced in total, and both Pebbles can prioritize compacting it away by boosting its overlapping size in `pickCompactionSeedFile` by 5x (i.e. `Size/ReferencedSum`), or by 2.5x (i.e. the previous factor divided by 2) or so.
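To make the arithmetic in the two examples concrete, here is a minimal sketch of the `Size/ReferencedSum` factor. The `backingRefs` type and the `compactionBoost` function are hypothetical names used only for illustration; they are not part of Pebble, and how the factor would actually be applied inside `pickCompactionSeedFile` is left open.

```go
package main

import "fmt"

// backingRefs is a hypothetical summary of one shared backing sstable,
// as it might be assembled from a sweep of the marker files.
type backingRefs struct {
	Size       uint64   // total size of the backing sstable, in bytes
	Referenced []uint64 // bytes referenced by each Pebble instance
}

// compactionBoost returns Size/ReferencedSum. A value above 1 means the
// file is mostly unreferenced and could be prioritized; a value below 1
// means it is referenced more than once over and could be deprioritized.
// Files with no size or no recorded references get a neutral factor of 1
// (an assumption of this sketch).
func compactionBoost(b backingRefs) float64 {
	var sum uint64
	for _, r := range b.Referenced {
		sum += r
	}
	if b.Size == 0 || sum == 0 {
		return 1
	}
	return float64(b.Size) / float64(sum)
}

func main() {
	// Two instances each reference 10% of a 1000-byte file.
	sparse := backingRefs{Size: 1000, Referenced: []uint64{100, 100}}
	// Two instances each reference the whole file.
	full := backingRefs{Size: 1000, Referenced: []uint64{1000, 1000}}

	fmt.Printf("%.2f\n", compactionBoost(sparse))   // prints 5.00
	fmt.Printf("%.2f\n", compactionBoost(sparse)/2) // prints 2.50
	fmt.Printf("%.2f\n", compactionBoost(full))     // prints 0.50
}
```

Dividing the raw factor by 2, as in the second call, corresponds to the more conservative 2.5x boost mentioned above.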
See the comment at #2538 (review) for more context on this issue.