Add the write buffer manager to JNI #4
Conversation
Summary: Allow RocksJava to explicitly create a WriteBufferManager by plumbing it through to the native code via JNI. Pull Request resolved: facebook/rocksdb#4492 Differential Revision: D10428506 Pulled By: sagar0 fbshipit-source-id: cd9dd8c2ef745a0303416b44e2080547bdcca1fd
Summary:
1. `WriteBufferManager` should have a reference kept alive on the Java side through `Options`/`DBOptions`; otherwise, if it is GC'ed on the Java side, the native side can segfault.
2. The native method `setWriteBufferManager()` in `DBOptions.java` did not have its JNI method implementation in rocksdbjni; it is added in this PR.
3. `DBOptionsTest.java` was referencing an `Options` object when it should be testing against `DBOptions`. Seems like a copy-paste error.
4. Add a getter for `WriteBufferManager`.
Pull Request resolved: facebook/rocksdb#4579 Differential Revision: D10561150 Pulled By: sagar0 fbshipit-source-id: 139a15c7f051a9f77b4200215b88267b48fbc487
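To illustrate point 1 above, here is a minimal, self-contained sketch of the ownership pattern. All class names are hypothetical stand-ins (the real types live in `org.rocksdb`); the point is only that the options object must hold a strong Java-side reference so the manager, and with it the native handle, cannot be garbage-collected while native code still uses it.

```java
// Hypothetical stand-ins for RocksJava types; this only models the
// ownership pattern described in point 1, not the real RocksDB API.

class NativeHandleSketch {
    private boolean disposed = false;     // models the native allocation
    boolean isAlive() { return !disposed; }
    void dispose() { disposed = true; }   // models freeing native memory
}

class WriteBufferManagerSketch {
    final NativeHandleSketch handle = new NativeHandleSketch();
}

class OptionsSketch {
    // Point 1: keep a strong Java-side reference so the manager cannot be
    // GC'ed (and its native handle freed) while the native side still uses it.
    private WriteBufferManagerSketch writeBufferManager;

    OptionsSketch setWriteBufferManager(WriteBufferManagerSketch wbm) {
        this.writeBufferManager = wbm;
        return this;
    }

    // Point 4: a getter so callers can retrieve the manager they configured.
    WriteBufferManagerSketch writeBufferManager() { return writeBufferManager; }
}

public class WbmOwnershipDemo {
    public static void main(String[] args) {
        WriteBufferManagerSketch wbm = new WriteBufferManagerSketch();
        OptionsSketch opts = new OptionsSketch().setWriteBufferManager(wbm);
        // As long as opts is reachable, so are wbm and its native handle.
        System.out.println(opts.writeBufferManager() == wbm);             // true
        System.out.println(opts.writeBufferManager().handle.isAlive());  // true
    }
}
```

In the real binding the same idea applies: `Options`/`DBOptions` retains the `WriteBufferManager` it was configured with, so its lifetime is at least as long as the options object's.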
Thanks for the contribution @mikekap . The changes LGTM.
@pnowojski @Myasuka could you also take a look here when time allows? Thanks.
LGTM. If we want to enable the second feature, charging memory used by memtables to the block cache, I think we'd better also pick rocksdb PR 4695.
I'm not sure I can add anything by reviewing this PR (I'm completely unfamiliar with this code base :( ). Maybe @azagrebin or @tillrohrmann would like to take a look here?
Summary: WriteBufferManager is not invoked when allocating memory for a memtable if the limit is not set, even when a cache is passed. This is inconsistent with what the comment says. Fix it. Pull Request resolved: facebook/rocksdb#4695 Differential Revision: D13112722 Pulled By: siying fbshipit-source-id: 0b27eef63867f679cd06033ea56907c0569597f4
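The semantics that commit fixes can be sketched as a small, self-contained model (all names here are illustrative, not the real RocksDB API): when a cache is supplied, memtable allocations are charged against the shared cache budget even if no explicit buffer limit is set.

```java
// A simplified model of the behavior fixed in facebook/rocksdb#4695.
// Class and method names are hypothetical, not the real RocksDB API.

class CacheBudgetSketch {
    private final long capacity;
    private long used = 0;
    CacheBudgetSketch(long capacity) { this.capacity = capacity; }
    void reserve(long bytes) { used += bytes; }  // charge memtable memory to the cache
    long used() { return used; }
}

class WriteBufferManagerModel {
    private final long bufferLimit;         // 0 means "no explicit limit"
    private final CacheBudgetSketch cache;  // may be null

    private long memtableBytes = 0;

    WriteBufferManagerModel(long bufferLimit, CacheBudgetSketch cache) {
        this.bufferLimit = bufferLimit;
        this.cache = cache;
    }

    // After the fix: charge the cache whenever one was supplied, even when
    // bufferLimit == 0 (previously the cache was ignored in that case).
    void allocateMemtable(long bytes) {
        memtableBytes += bytes;
        if (cache != null) {
            cache.reserve(bytes);
        }
    }

    boolean shouldFlush() {
        return bufferLimit > 0 && memtableBytes >= bufferLimit;
    }
}

public class WbmCacheChargeDemo {
    public static void main(String[] args) {
        CacheBudgetSketch cache = new CacheBudgetSketch(64L << 20);
        // No explicit limit (0), but a cache is passed:
        WriteBufferManagerModel wbm = new WriteBufferManagerModel(0, cache);
        wbm.allocateMemtable(1 << 20);
        System.out.println(cache.used()); // 1048576
    }
}
```

This is why picking PR 4695 matters for the second feature mentioned above: without it, passing a cache but no limit silently disables the cache accounting.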
@Myasuka added the extra cherry-pick.
Thanks for the update @mikekap . We (@azagrebin and I) need some offline discussion about how to release a new frocksdb version and will get back to this ASAP.
If someone could build a jar out of this for me that I could test, I would be happy to play around with the WriteBufferManager :) Unfortunately I failed to set up a build environment for this...
…_family_test (#4474) Summary: this should fix the currently failing TSAN jobs. The callstack for TSAN:

```
WARNING: ThreadSanitizer: data race (pid=87440)
Read of size 8 at 0x7d580000fce0 by thread T22 (mutexes: write M548703):
  #0 rocksdb::InternalStats::DumpCFStatsNoFileHistogram(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) db/internal_stats.cc:1204 (column_family_test+0x00000080eca7)
  #1 rocksdb::InternalStats::DumpCFStats(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) db/internal_stats.cc:1169 (column_family_test+0x0000008106d0)
  #2 rocksdb::InternalStats::HandleCFStats(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, rocksdb::Slice) db/internal_stats.cc:578 (column_family_test+0x000000810720)
  #3 rocksdb::InternalStats::GetStringProperty(rocksdb::DBPropertyInfo const&, rocksdb::Slice const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) db/internal_stats.cc:488 (column_family_test+0x00000080670c)
  #4 rocksdb::DBImpl::DumpStats() db/db_impl.cc:625 (column_family_test+0x00000070ce9a)

Previous write of size 8 at 0x7d580000fce0 by main thread:
  #0 rocksdb::InternalStats::AddCFStats(rocksdb::InternalStats::InternalCFStatsType, unsigned long) db/internal_stats.h:324 (column_family_test+0x000000693bbf)
  #1 rocksdb::ColumnFamilyData::RecalculateWriteStallConditions(rocksdb::MutableCFOptions const&) db/column_family.cc:818 (column_family_test+0x000000693bbf)
  #2 rocksdb::ColumnFamilyTest_WriteStallSingleColumnFamily_Test::TestBody() db/column_family_test.cc:2563 (column_family_test+0x0000005e5a49)
```

Pull Request resolved: facebook/rocksdb#4474 Differential Revision: D10262099 Pulled By: miasantreble fbshipit-source-id: 1247973a3ca32e399b4575d3401dd5439c39efc5
@mikekap after some offline discussion with @azagrebin , we think it might be a better idea to bump the base rocksdb version to 5.18.3, which includes all the commits required here, and then build and test a new frocksdb version based on 5.18.3, to prevent any unknown issues or missing commits among those we're trying to backport here. Please let us know if you have a different opinion. Thanks.
@gyfora we will try to supply a new frocksdb try-out version based on rocksdb 5.18.3 ASAP, which should resolve your problem. We will let you know once we can supply it, and hopefully you can try it out and let us know how it goes in your case. Thanks.
Sounds like a very nice approach, thanks @carp84 . I am looking forward to the try-out version, but take the time you need :)
@carp84 that sounds even better to me - feel free to close this PR then :)
One quick update: we found rocksdb 5.18.3 has a slight performance degradation (~5%) compared to 5.17.2, so more investigation is required. Will keep updating the progress here.
@mikekap sorry for the late response; here is the latest progress: we reproduced the performance regression in 5.18.3 compared to 5.17.2 through rocksdb's native benchmarks. @azagrebin could you please check and help merge the PR here? Thanks.
Hey all! Thanks! :) |
@gyfora Sorry about the lack of notification here, our bad... We're now going through the frocksdb release process (with this PR applied on top of frocksdb 5.17.2) and the related tests, including the state backend benchmark, flink travis, etc. Once all tests pass and are confirmed problem-free, I believe this PR can be merged. Please also let us know if you have any suggestions/concerns, thanks! btw, this PR as well as FLINK-7289 are planned to be included in the Flink 1.10 release and we will try our very best to make it.
Bump version to io.github.myasuka frocksdbjni 5.17.2-artisans-3.0. This includes dataArtisans/frocksdb#5 and dataArtisans/frocksdb#4
Fix windows build bug and modify publish script
Besides the rocksdb travis testing, @Myasuka has tested flink travis ci against his private frocksdb build, and I did the state benchmark comparison and confirmed no performance regression. So I think this PR is good to go.
Thanks for the thorough testing. |
Summary: I must have chosen trimming before frame 8 based on assertion failures, but that trims too many frames for a general segfault. So this changes it to start printing at frame 4, as in this example where I've seeded a null deref:

```
Received signal 11 (Segmentation fault)
Invoking LLDB for stack trace...
Process 873208 stopped
* thread #1, name = 'db_stress', stop reason = signal SIGSTOP
    frame #0: 0x00007fb1fe8f1033 libc.so.6`__GI___wait4(pid=873478, stat_loc=0x00007fb1fb114030, options=0, usage=0x0000000000000000) at wait4.c:30:10
  thread #2, name = 'rocksdb:low', stop reason = signal SIGSTOP
    frame #0: 0x00007fb1fe8972a1 libc.so.6`__GI___futex_abstimed_wait_cancelable64 at futex-internal.c:57:12
Executable module set to "/data/users/peterd/rocksdb/db_stress".
Architecture set to: x86_64-unknown-linux-gnu.
True
  frame #4: 0x00007fb1fe844540 libc.so.6`__restore_rt at libc_sigaction.c:13
  frame #5: 0x0000000000608514 db_stress`rocksdb::StressTest::InitDb(rocksdb::SharedState*) at db_stress_test_base.cc:345:18
  frame #6: 0x0000000000585d62 db_stress`rocksdb::RunStressTestImpl(rocksdb::SharedState*) at db_stress_driver.cc:84:17
  frame #7: 0x000000000058dd69 db_stress`rocksdb::RunStressTest(shared=0x00006120000001c0) at db_stress_driver.cc:266:34
  frame #8: 0x0000000000453b34 db_stress`rocksdb::db_stress_tool(int, char**) at db_stress_tool.cc:370:20
  ...
```

Pull Request resolved: facebook/rocksdb#12101 Test Plan: manual (see above) Reviewed By: ajkr Differential Revision: D51593217 Pulled By: pdillinger fbshipit-source-id: 4a71eb8e516edbc32e682f9537bc77d073a7b4ed
This cherry-picks a4d9aa6 & 6ecd26a from upstream. The WriteBufferManager can be used to limit memory across a process, which ends up being very useful for Flink.