defrag tool and related changes. #400

Closed
wants to merge 8 commits into from

@deepakjois deepakjois commented Jan 16, 2018

  • Code for defrag command

  • Method to do offline GC on all files w/ discard ratio
    set to 0.0.

  • Refactoring to keep code DRY.



@deepakjois
Contributor Author

Still working on the LSM tree compaction part, but would appreciate a review for the offline GC which runs on all value log files w/ discard ratio set to 0.0.
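To illustrate what a discard ratio of 0.0 implies, here is a minimal sketch. It assumes a file qualifies for rewriting when its stale fraction is at least the ratio. The `vlogFile` type and the `filesToRewrite` helper are hypothetical names for illustration, not Badger's API.

```go
package main

// vlogFile is a hypothetical summary of one value log file.
type vlogFile struct {
	id         uint32
	totalBytes float64 // bytes of entries sampled from the file
	staleBytes float64 // bytes belonging to deleted or superseded entries
}

// filesToRewrite returns the ids of files whose stale fraction is at
// least discardRatio. This threshold rule is an assumption made for
// the sketch.
func filesToRewrite(files []vlogFile, discardRatio float64) []uint32 {
	var ids []uint32
	for _, f := range files {
		if f.totalBytes == 0 {
			continue // nothing to reclaim from an empty file
		}
		if f.staleBytes/f.totalBytes >= discardRatio {
			ids = append(ids, f.id)
		}
	}
	return ids
}
```

Under this rule a ratio of 0.0 selects every non-empty file, which is what makes it a full offline pass rather than the usual selective GC.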

@deepakjois
Contributor Author

Now added code to do offline compaction as well.

@janardhan1993

For offline compaction we need to remove tombstones; I don't see any code for that.


Reviewed 4 of 4 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks broke.



@deepakjois
Contributor Author

Hmm…yes. I forgot about that case. I haven’t really dug too deep into the code for the LSM stuff. The compaction part was easy to infer, but actually removing entries from the tables sounds a bit more involved. Any pointers for that?



@janardhan1993

While compacting, if the final table belongs to the last level, we can skip adding to the table builder any entry that has bitDelete set in its meta or that has expired (there should be a function for checking that).
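A minimal sketch of that filtering step follows. The `entry` type, the `droppable` and `entriesForBuilder` helpers, and the flag value chosen for `bitDelete` are all assumptions for illustration, not Badger's actual code. The last-level condition matters because a tombstone in a higher level still has to shadow older versions of the key in the levels below it.

```go
package main

// bitDelete marks a tombstone in the entry's meta byte.
// The concrete bit value here is illustrative.
const bitDelete byte = 1 << 0

// entry is a hypothetical, simplified key/value record.
type entry struct {
	key       string
	meta      byte
	expiresAt uint64 // Unix seconds; 0 means the entry never expires
}

// droppable reports whether e is a tombstone or has expired at time now.
func droppable(e entry, now uint64) bool {
	if e.meta&bitDelete != 0 {
		return true
	}
	return e.expiresAt != 0 && e.expiresAt <= now
}

// entriesForBuilder returns the entries that should be handed to the
// table builder. Tombstones and expired entries are skipped only when
// the output table lands on the last level.
func entriesForBuilder(in []entry, lastLevel bool, now uint64) []entry {
	out := make([]entry, 0, len(in))
	for _, e := range in {
		if lastLevel && droppable(e, now) {
			continue
		}
		out = append(out, e)
	}
	return out
}
```

With `lastLevel` set to false the input passes through unchanged, so the same function could serve both regular and last-level compactions.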



Janardhan Reddy and others added 5 commits January 31, 2018 15:11
This involves a change in the way Badger stores data on disk, so
we need to increment the magic version as well.
* Change the way purge works
* Stop searching as soon as a value is found instead of searching all tables
* Define ErrPurged to be nil for Windows/OSX
* Update GC stats in the background
* Ensure all versions of a key are always written to the same table
* Add a test to punch holes twice in the same file and delete it later
This tool:

* Purges any older versions of keys in a Badger DB.

* Does a full GC after purging

* Finally, it compacts all levels of the LSM tree
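As a rough illustration of the first step, purging older versions amounts to keeping only the highest version of each key. The sketch below is self-contained and uses a hypothetical `versioned` type and `latestVersions` helper. It does not reflect how Badger stores or iterates versions internally.

```go
package main

// versioned is a hypothetical record carrying a key, a version number
// and a value.
type versioned struct {
	key     string
	version uint64
	value   string
}

// latestVersions keeps, for every key, only the record with the highest
// version. Everything else is what a purge would discard.
func latestVersions(entries []versioned) map[string]versioned {
	latest := make(map[string]versioned)
	for _, e := range entries {
		cur, ok := latest[e.key]
		if !ok || e.version > cur.version {
			latest[e.key] = e
		}
	}
	return latest
}
```

Running the purge before the GC makes sense in this picture: values belonging to discarded versions become stale, so the GC pass that follows can reclaim them.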

Changes include:

* Code for defrag command

* Methods to do offline GC on all files w/ discard ratio
  set to 0.0.

* Methods to do offline compaction for all levels in LSM
  tree.

* Refactoring to keep code DRY.
@manishrjain manishrjain deleted the dj/defrag branch May 9, 2018 01:09