prevent flush entry followed delete operations #411
Duplicate of #363.
@hilikspdb please approve.
ayulas changed the title from "prevent flush delete entry" to "prevent flush entry followed delete operations" on Mar 29, 2023
@erez-speedb please check; it's just to be on the safe side. You don't have deletes in db_bench, but just to be sure. Thanks!
@erez-speedb please run a check that nothing was badly affected by this feature in general.
Passed performance tests, no degradation.
Yuval-Ariel pushed a commit that referenced this issue on Jun 12, 2023
Currently, during memtable flush, if a key has a matching key in the delete range table and no snapshot is related to this record, we still write it with its value to the SST file. This feature keeps only the delete record and reduces the SST size for later compaction.
udi-speedb pushed a commit that referenced this issue on Nov 19, 2023
…y that has a delete entry (#411): currently, during memtable flush, if a key has a matching key in the delete range table and no snapshot is related to this record, we still write it with its value to the SST file. This feature keeps only the delete record and reduces the SST size for later compaction.
prevent flush delete entry if possible
Background:
Ceph uses RocksDB to store metadata (MD) objects and change-log entries.
Ceph change-log entries are written once and removed when the primary receives an ACK from all replicas.
Ceph uses RocksDB for the change log so that it can share the WAL used by the Ceph metadata objects (which use RocksDB in a more traditional way).
The two operations happen together (every write operation creates or updates one or more MD objects and creates a temporary change-log entry describing the operation), so Ceph can commit the metadata and the change-log entry in a single transaction to the WAL.
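To make the single-transaction idea concrete, here is a minimal toy model in Python. It is not RocksDB or Ceph code: `ToyWal`, `ToyStore`, the table names, and the keys are all hypothetical and only illustrate one log record covering both the metadata update and the change-log entry.

```python
import json


class ToyWal:
    """Toy append-only log: one record per committed batch (hypothetical format)."""

    def __init__(self):
        self.records = []

    def append(self, ops):
        # The whole batch is serialized as a single record, so on replay it is
        # applied all-or-nothing.
        self.records.append(json.dumps(ops))


class ToyStore:
    """Toy store with two tables that share one WAL."""

    def __init__(self):
        self.wal = ToyWal()
        self.metadata = {}
        self.changelog = {}

    def commit(self, ops):
        self.wal.append(ops)  # one WAL write covers both tables
        for table_name, kind, key, value in ops:
            table = getattr(self, table_name)
            if kind == "put":
                table[key] = value
            else:
                table.pop(key, None)


store = ToyStore()
# One client write: update the MD object and record a temporary change-log entry.
store.commit([
    ("metadata", "put", "obj1", "size=4096"),
    ("changelog", "put", "log.0001", "write obj1"),
])
# Later, once every replica has ACKed, the change-log entry is deleted.
store.commit([("changelog", "delete", "log.0001", None)])
```

In this model the change-log entry lives only between two commits, which is the short-lived pattern the rest of this issue is about.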
We would like to filter out all objects hit by delete operations during RocksDB::Flush() and never commit them to disk.
Ideally we would remove objects hit by delete operations while they are still in the memtable, but that is impossible in RocksDB because its memtables are immutable.
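The flush-time filtering requested here can be sketched roughly as follows. This is a simplified Python model, not the actual implementation: `Entry`, `flush_filter`, and the sequence numbers are hypothetical, and it only handles point deletes (not the delete range table). It assumes a snapshot at sequence `s` sees every entry with sequence `<= s`.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    key: str
    seq: int
    kind: str  # "put" or "delete"
    value: Optional[str] = None


def visible_to_snapshot(put_seq, del_seq, snapshots):
    # A snapshot at sequence s sees the put but not the delete
    # when put_seq <= s < del_seq.
    return any(put_seq <= s < del_seq for s in snapshots)


def flush_filter(entries, snapshots):
    """Return the entries to write out, dropping puts hidden by a newer delete.

    The delete record itself is always kept, since older files may still
    contain the key.
    """
    out = []
    shadowing_delete = {}  # key -> seq of the closest newer delete seen so far
    # Visit each key's entries from newest to oldest.
    for e in sorted(entries, key=lambda e: (e.key, -e.seq)):
        if e.kind == "delete":
            out.append(e)
            shadowing_delete[e.key] = e.seq
            continue
        del_seq = shadowing_delete.get(e.key)
        if del_seq is not None and not visible_to_snapshot(e.seq, del_seq, snapshots):
            continue  # no reader can ever see this value: skip writing it
        out.append(e)
    return out


entries = [
    Entry("log.0001", 10, "put", "write obj1"),
    Entry("log.0001", 12, "delete"),
    Entry("log.0002", 11, "put", "write obj2"),
    Entry("obj1", 9, "put", "size=4096"),
]
kept = flush_filter(entries, snapshots=[])
kept_with_snapshot = flush_filter(entries, snapshots=[11])
```

With no snapshots, the value for `log.0001` is dropped and only its delete record is written; with a snapshot at sequence 11, the value has to be kept because that snapshot can still read it.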