Improve throughput #59
The results look promising.
Codecov Report
@@ Coverage Diff @@
## master #59 +/- ##
============================================
- Coverage 89.69% 89.02% -0.67%
- Complexity 177 181 +4
============================================
Files 22 23 +1
Lines 679 711 +32
Branches 51 53 +2
============================================
+ Hits 609 633 +24
- Misses 47 54 +7
- Partials 23 24 +1
Codecov Report
@@ Coverage Diff @@
## master #59 +/- ##
============================================
- Coverage 90.08% 89.62% -0.47%
Complexity 230 230
============================================
Files 25 26 +1
Lines 908 925 +17
Branches 65 65
============================================
+ Hits 818 829 +11
- Misses 59 65 +6
Partials 31 31
I don't remember, but did you also try separating out the reads and writes into two different
I remember trying, on earlier versions of DBeam, to have two phases: read from JDBC and write to Avro. The problem was that Beam waited for the read bundle to complete, serialized it, and only then wrote to Avro, which was very inefficient. If we found a way to "stream" from different
Thanks, I remember now.
- Use `directBinaryEncoder`, write using `appendEncoded()`. This avoids some copying of bytes between buffers.
- Use a `BlockingQueue` to asynchronously read from JDBC and write to file.

Early experiments show that this can improve throughput by roughly 15 to 30%.
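
For readers unfamiliar with the pattern, the `BlockingQueue` hand-off described above can be sketched roughly as follows. This is not the code in this PR. `AsyncReadWriteSketch`, `encode`, and `copy` are hypothetical names, plain strings stand in for JDBC rows, and UTF-8 bytes stand in for Avro binary encoding. The idea is only that one thread reads and encodes while another writes already-encoded bytes, with a bounded queue between them.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class AsyncReadWriteSketch {
  // Sentinel buffer telling the writer that no more records are coming.
  private static final ByteBuffer POISON = ByteBuffer.allocate(0);

  // Hypothetical stand-in for "encode one JDBC row to binary".
  static ByteBuffer encode(String row) {
    return ByteBuffer.wrap((row + "\n").getBytes(StandardCharsets.UTF_8));
  }

  // Reads and encodes rows on a background thread while the calling thread
  // writes the encoded bytes to `out`. Returns the number of records written.
  static long copy(Iterable<String> rows, OutputStream out, int capacity) {
    BlockingQueue<ByteBuffer> queue = new ArrayBlockingQueue<>(capacity);

    Thread reader = new Thread(() -> {
      try {
        for (String row : rows) {
          queue.put(encode(row)); // blocks when the writer falls behind
        }
        queue.put(POISON);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // writer asked us to stop
      }
      // A production version would also report reader failures to the writer.
    });
    reader.start();

    long count = 0;
    try {
      while (true) {
        ByteBuffer buf = queue.take(); // blocks until a record is available
        if (buf == POISON) {
          break;
        }
        // Write the pre-encoded bytes as-is, without re-encoding them.
        out.write(buf.array(), buf.arrayOffset() + buf.position(), buf.remaining());
        count++;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while writing", e);
    } finally {
      reader.interrupt(); // no-op if the reader already finished
    }
    return count;
  }
}
```

Calling `AsyncReadWriteSketch.copy(rows, out, 2)` with a small capacity keeps memory bounded: the reader blocks on `put` once two encoded records are waiting, so a slow writer throttles the reader instead of letting the queue grow.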
- master/#60: https://travis-ci.org/spotify/dbeam/builds/520639708#L1545
- #61 (just encode to binary, no multithreading): https://travis-ci.org/spotify/dbeam/builds/520647554#L1524
- This PR: https://travis-ci.org/spotify/dbeam/builds/520648264#L1539