remove single delete elements during memtable flush #363
Comments
Ayelet is opening a new pull request from a different branch and will attach it here.
Checked branch 411-prevent-flush-single-delete-entry. @ayulas, should I test branch 363-remove-single-delete?
411 is the right one.
Background:
Ceph uses RocksDB to store metadata (MD) objects and change-log entries.
Ceph change-log entries are written once and removed when the primary receives an ACK from all the replicas.
Ceph uses RocksDB for the change log so that it can share the WAL already used by the Ceph metadata objects (which use RocksDB in a more traditional way).
The two operations happen together: every write operation creates or updates one or more MD objects and creates a temporary change-log entry describing the operation. Ceph can therefore commit the metadata and the change-log entry to the WAL in a single transaction.
Suggested change:
Take advantage of the singleton property of the change-log entries, which are created once, never updated, and eventually deleted, and use RocksDB::SingleDelete() instead of RocksDB::Delete().
This by itself doesn't help, because SingleDelete entries are still written to the SST files and are only processed there.
We would like to filter out all objects hit by SingleDelete() during RocksDB::Flush() so that they are never committed to disk.
Ideally, we would remove objects hit by SingleDelete() while they are still in the memtable, but that is impossible in RocksDB because its memtables are immutable.
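To make the proposed flush-time filter concrete, here is a minimal, self-contained sketch in plain C++. It does not use RocksDB and is not how RocksDB implements flush: `Entry`, `ToyMemtable`, and `FlushFiltered` are hypothetical names invented for illustration, and the model assumes no snapshots, no merge operands, and no range deletions. It only shows the core idea that a key whose sole operations in the memtable are one Put followed by one SingleDelete can be dropped instead of being written to the SST.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical operation types; real RocksDB has more (Delete, Merge, ...).
enum class OpType { kPut, kSingleDelete };

// One logged operation in the toy memtable.
struct Entry {
    std::string key;
    std::uint64_t seq;   // monotonically increasing sequence number
    OpType type;
    std::string value;   // empty for SingleDelete
};

// Toy append-only memtable: entries are only ever added, never removed.
struct ToyMemtable {
    std::vector<Entry> entries;
    std::uint64_t next_seq = 1;

    void Put(const std::string& k, const std::string& v) {
        entries.push_back({k, next_seq++, OpType::kPut, v});
    }
    void SingleDelete(const std::string& k) {
        entries.push_back({k, next_seq++, OpType::kSingleDelete, ""});
    }
};

// Returns the entries that would be written to the "SST", in key order.
// A key is dropped only when its complete history in this memtable is
// exactly [Put, SingleDelete]. A SingleDelete with no matching Put here
// is kept, because the Put may already live in an older SST file.
std::vector<Entry> FlushFiltered(const ToyMemtable& mem) {
    std::map<std::string, std::vector<Entry>> by_key;
    for (const Entry& e : mem.entries) {
        by_key[e.key].push_back(e);  // preserves ascending seq order per key
    }

    std::vector<Entry> sst;
    for (const auto& kv : by_key) {
        const std::vector<Entry>& ops = kv.second;
        const bool cancels = ops.size() == 2 &&
                             ops[0].type == OpType::kPut &&
                             ops[1].type == OpType::kSingleDelete;
        if (cancels) {
            assert(ops[0].seq < ops[1].seq);
            continue;  // the pair annihilates; nothing reaches disk
        }
        sst.insert(sst.end(), ops.begin(), ops.end());
    }
    return sst;
}
```

For example, after a Put and a SingleDelete of a change-log key plus a Put of an MD object, `FlushFiltered` would return only the MD object's entry. A real implementation would also have to respect live snapshots, which this sketch deliberately ignores.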