-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for LZ4 compression. #74
Conversation
The latest greatest lz4 versions aren't packaged by any major Linux distributions, so that's why I opted for a kind of in-tree build. |
Those are great results. :) I would prefer using lz4 as a shared library instead of static in rocksdb tree. That's how we use other compression libraries. |
Okay, I'll make the changes and get back to you. |
Need to figure out why LZ4 is so much slower when called from a shared library... |
Found the problem: https://code.google.com/p/lz4/issues/detail?id=114 Some numbers when compiling liblz4.so with -O3: Compression: snappy Compression: lz4 Compression: lz4hc Typical inputs should compress smaller with lz4hc than with lz4. This pull request is ready to merge unless you have more comments. |
@@ -19,7 +19,8 @@ | |||
# | |||
# -DLEVELDB_PLATFORM_POSIX if cstdatomic is present | |||
# -DLEVELDB_PLATFORM_NOATOMIC if it is not | |||
# -DSNAPPY if the Snappy library is present | |||
# -DSNAPPY if the Snappy library is present | |||
# -DLZ4 -DLZ4HC if the LZ4 library is present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need both defines? Is there a use case where only one will be present and rocksdb should work correctly/partially?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not. I'll get rid of the LZ4HC one.
I just checked our internal third-party release libraries: we are able to access lz4 library in our internal release. I think with this diff, other team can benefit from lz4 seamlessly in next release. |
inline char* BZip2_Uncompress(const char* input_data, size_t input_length, | ||
int* decompress_size) { | ||
inline char *BZip2_Uncompress(const char *input_data, size_t input_length, | ||
int *decompress_size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this what make format
did? We keep the asterisks by the type, not the variable name, so: int* decompress_size
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would help me if you could make format the master branch before I retry this. make format doesn't seem to be idempotent.
Great PR, tnx! I have some comments. Can you also try running |
Think I addressed most of the issues. arc lint found 2, but it also seems my setup isn't quite right yet:
Could you add a page on the wiki on how contributors should set up arcanist and cpplint? Finally, it's probably time to make format master again. It helps to run make format a few times in a row to see the issues with it not being idempotent. |
cc @liukai on |
@alberts I think the lint issue showed up because you didn't run "arc lint" or "arc diff" in rocksdb root dir (sorry that's a known issue). Similarly make format works for uncommitted files or your last committed diff. If you want to apply |
I've rolled up the work into a single commit now. |
Support for LZ4 compression.
Summary: ./db_compaction_test DBCompactionTest.LevelTtlCascadingCompactions passed 96 / 100 times. ``` With the fix: all runs (tried 100, 1000, 10000) succeed. ``` $ TEST_TMPDIR=/dev/shm ~/gtest-parallel/gtest-parallel ./db_compaction_test --gtest_filter=DBCompactionTest.LevelTtlCascadingCompactions --repeat=1000 [1000/1000] DBCompactionTest.LevelTtlCascadingCompactions (1895 ms) ``` Test Plan: Build: ``` COMPILE_WITH_TSAN=1 make db_compaction_test -j100 ``` Without the fix: a few runs out of 100 fail: ``` $ TEST_TMPDIR=/dev/shm KEEP_DB=1 ~/gtest-parallel/gtest-parallel ./db_compaction_test --gtest_filter=DBCompactionTest.LevelTtlCascadingCompactions --repeat=100 ... ... Note: Google Test filter = DBCompactionTest.LevelTtlCascadingCompactions [==========] Running 1 test from 1 test case. [----------] Global test environment set-up. [----------] 1 test from DBCompactionTest [ RUN ] DBCompactionTest.LevelTtlCascadingCompactions db/db_compaction_test.cc:3687: Failure Expected equality of these values: oldest_time Which is: 1580155869 level_to_files[6][0].oldest_ancester_time Which is: 1580155870 DB is still at /dev/shm//db_compaction_test_6337001442947696266 [ FAILED ] DBCompactionTest.LevelTtlCascadingCompactions (1432 ms) [----------] 1 test from DBCompactionTest (1432 ms total) [----------] Global test environment tear-down [==========] 1 test from 1 test case ran. (1433 ms total) [ PASSED ] 0 tests. [ FAILED ] 1 test, listed below: [ FAILED ] DBCompactionTest.LevelTtlCascadingCompactions 1 FAILED TEST [80/100] DBCompactionTest.LevelTtlCascadingCompactions returned/aborted with exit code 1 (1489 ms) [100/100] DBCompactionTest.LevelTtlCascadingCompactions (1522 ms) FAILED TESTS (4/100): 1419 ms: ./db_compaction_test DBCompactionTest.LevelTtlCascadingCompactions (try #90) 1434 ms: ./db_compaction_test DBCompactionTest.LevelTtlCascadingCompactions (try #84) 1457 ms: ./db_compaction_test DBCompactionTest.LevelTtlCascadingCompactions (try #82) 1489 ms: ./db_compaction_test DBCompactionTest.LevelTtlCascadingCompactions (try #74) Differential Revision: D19587040 Pulled By: sagar0 fbshipit-source-id: 11191ae9940837643bff47ebe18b299b4be3d950
Fix assert/data corruption when using persistent cache
Support for LZ4 compression.
CPU: 4 * Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz
Compression:
snappy: 4.928 micros/op 202905 ops/sec; 792.6 MB/s (output: 55.1%)
lz4: 5.973 micros/op 167427 ops/sec; 654.0 MB/s (output: 55.4%)
lz4hc: 65.445 micros/op 15279 ops/sec; 59.7 MB/s (output: 55.4%)
Decompression:
snappy: 0.837 micros/op 1194871 ops/sec; 4667.5 MB/s
lz4: 0.659 micros/op 1517871 ops/sec; 5929.2 MB/s
lz4hc: 0.667 micros/op 1498213 ops/sec; 5852.4 MB/s